Abstract

Mixtures are convex combinations of laws. Despite this simple definition, a mixture 
can be far more subtle than its mixed components. For instance, mixing Gaussian 
laws may produce a potential with multiple deep wells. We study in the present work 
fine properties of mixtures with respect to concentration of measure and Sobolev type 
functional inequalities. We provide sharp Laplace bounds for Lipschitz functions in the 
case of generic mixtures, involving a transportation cost diameter of the mixed family. 
Additionally, our analysis of Sobolev type inequalities for two-component mixtures 
reveals natural relations with some kind of band isoperimetry and support constrained
interpolation via mass transportation. We show that the Poincaré constant of a
two-component mixture may remain bounded as the mixture proportion goes to 0 or 1 while
the logarithmic Sobolev constant may surprisingly blow up. This counter-intuitive
result is not reducible to support disconnections, and appears as a reminiscence of the
variance-entropy comparison on the two-point space. As far as mixtures are concerned,
the logarithmic Sobolev inequality is less stable than the Poincaré inequality and the
sub-Gaussian concentration for Lipschitz functions. We illustrate our results on a
gallery of concrete two-component mixtures. This work leads to many open questions.

1 Introduction

Mixtures of distributions are ubiquitous in Stochastic Analysis, Modelling, Simulation, and
Statistics, see for instance the monographs [TBI 021 EH GHJ [50]. Recall that a mixture of
distributions is nothing else but a convex combination of these distributions. For instance,
if $\mu_0$ and $\mu_1$ are two laws on the same space, and if $p \in [0,1]$ and $q = 1-p$, then the law
$p\mu_1 + q\mu_0$ is a "two-component mixture". More generally, a finite mixture takes the form
$p_1\mu_1 + \cdots + p_n\mu_n$ where $\mu_1, \ldots, \mu_n$ are probability measures on a common measurable
space and $p_1\delta_1 + \cdots + p_n\delta_n$ is a finite discrete probability measure. A widely used example
is given by finite mixtures of Gaussians for which $\mu_i = \mathcal{N}(m_i, \sigma_i^2)$ for every $1 \leq i \leq n$. In
that case, for certain choices of $m_1, \ldots, m_n$ and $\sigma_1, \ldots, \sigma_n$, the mixture

$$p_1\mathcal{N}(m_1,\sigma_1^2) + \cdots + p_n\mathcal{N}(m_n,\sigma_n^2)$$

is multi-modal and its log-density is a multiple wells potential. For instance, each
component may typically correspond in Statistics to a sub-population, in Information Theory
to a channel, and in Statistical Physics to an equilibrium. Another very natural example
is given by the invariant measures of finite Markov chains, which are mixtures of the
invariant measures uniquely associated to each recurrent class of the chain. A more subtle
example is the local field of the Sherrington-Kirkpatrick model of spin glasses which gives
rise to a mixture of two univariate Gaussians with equal variances, see for instance [13].

At this point, it is enlightening to introduce a slightly more abstract point of view. Let
$\nu$ be a probability measure on some measurable space $\Theta$ and $(\mu_\theta)_{\theta\in\Theta}$ be a collection
of probability measures on some common fixed measurable space $\mathcal{X}$, such that the map
$\theta \mapsto \mathbf{E}_{\mu_\theta} f$ is measurable for any fixed bounded continuous $f : \mathcal{X} \to \mathbb{R}$. The mixture
$\mathcal{M}(\nu, (\mu_\theta)_{\theta\in\Theta})$ is the law on $\mathcal{X}$ defined for any bounded measurable $f : \mathcal{X} \to \mathbb{R}$ by

$$\mathbf{E}_{\mathcal{M}(\nu,(\mu_\theta)_{\theta\in\Theta})} f = \int_\Theta \int_{\mathcal{X}} f(x)\, d\mu_\theta(x)\, d\nu(\theta) = \mathbf{E}_\nu\left(\theta \mapsto \mathbf{E}_{\mu_\theta} f\right).$$

Here $\nu$ is the mixing law whereas $(\mu_\theta)_{\theta\in\Theta}$ are the mixed laws or the mixture components
or even the mixed family. With these new notations, and for the finite mixture example
mentioned earlier we have $\Theta = \{1, \ldots, n\}$ and $\nu = p_1\delta_1 + \cdots + p_n\delta_n$ and

$$\mathcal{M}(\nu, (\mu_\theta)_{\theta\in\Theta}) = \mathcal{M}(p_1\delta_1 + \cdots + p_n\delta_n, \{\mu_1, \ldots, \mu_n\}) = p_1\mu_1 + \cdots + p_n\mu_n.$$

The mixture $\mathcal{M}(\nu, (\mu_\theta)_{\theta\in\Theta})$ can be seen as a sort of general convex combination in the convex
set of probability measures on $\mathcal{X}$. It appears for certain classes of $\nu$ as a particular Choquet
integral, see [43] and [17]. On the other hand, the case where the mixture components
are product measures is also related to exchangeability and De Finetti's Theorem, see for
instance [7]. In terms of random variables, if $(X, Y)$ is a couple of random variables then
the law $\mathcal{L}(X)$ of $X$ is a mixture of the family of conditional laws $\mathcal{L}(X \mid Y = y)$ with the
mixing law $\mathcal{L}(Y)$. In this way, mixing appears as the dual of the so-called disintegration
of measure. Here and in the whole sequel, the term "mixing" refers to the mixture of
distributions as defined above and has a priori nothing to do with weak dependence.
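As a small illustration of the defining identity $\mathbf{E}_\mu f = \sum_i p_i \mathbf{E}_{\mu_i} f$ in the finite Gaussian case, the following sketch computes the mean and variance of $p_1\mathcal{N}(m_1,\sigma_1^2) + \cdots + p_n\mathcal{N}(m_n,\sigma_n^2)$ by applying the identity to $f(x) = x$ and $f(x) = x^2$. The function name `gaussian_mixture_mean_var` is ours and purely illustrative.

```python
from math import isclose


def gaussian_mixture_mean_var(weights, means, sigmas):
    """Mean and variance of the finite mixture sum_i p_i N(m_i, sigma_i^2).

    Uses E_mu f = sum_i p_i E_{mu_i} f with f(x) = x and f(x) = x^2.
    """
    if not isclose(sum(weights), 1.0, abs_tol=1e-12):
        raise ValueError("weights must sum to 1")
    # First moment of the mixture: weighted average of the component means.
    mean = sum(p * m for p, m in zip(weights, means))
    # Second moment of N(m, s^2) is s^2 + m^2; average it with the weights.
    second = sum(p * (s * s + m * m) for p, m, s in zip(weights, means, sigmas))
    return mean, second - mean * mean
```

For the symmetric two-component mixture with means $\pm 1$ and unit variances, the variance is the common component variance plus the variance of the means, which is the kind of "diameter" contribution that reappears in the Laplace bounds for mixtures.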

Our first aim is to investigate the fine behavior of concentration of measure for
mixtures, for instance for a two-component mixture $p\mu_1 + q\mu_0$ as $\min(p, q)$ goes to 0. It is well
known that Poincaré and (Gross) logarithmic Sobolev functional inequalities are powerful
tools for obtaining concentration of measure. Our second aim is therefore to investigate
the fine behavior of these functional inequalities for mixtures, and in particular for
two-component mixtures. Our work reveals striking unexpected phenomena. In particular,
our results suggest that the logarithmic Sobolev inequality, which implies sub-Gaussian
concentration, is very sensitive to mixing, in contrast with the sub-Gaussian concentration
itself which is far more stable. As in [20] and [3], our work is connected to the more general
problem of the behavior of optimal constants for sequences of probability measures.

Let us start with the notion of concentration of measure for Lipschitz functions. We
denote by $\|\cdot\|_2$ the Euclidean norm of $\mathbb{R}^d$. A function $f : \mathbb{R}^d \to \mathbb{R}$ is Lipschitz when

$$\|f\|_{\mathrm{Lip}} = \sup_{x \neq y} \frac{|f(x) - f(y)|}{\|x - y\|_2} < \infty.$$

Let $\mu$ be a law on $\mathbb{R}^d$ such that $\mathbf{E}_\mu|f| < \infty$ for every Lipschitz function $f$. This holds true
for instance when $\mu$ has a finite first moment. We always make this assumption implicitly
in the sequel. We now define the log-Laplace transform $\alpha_\mu : \mathbb{R} \to [0, \infty]$ of $\mu$ by

$$\alpha_\mu(\lambda) = \log \sup_{\|f\|_{\mathrm{Lip}} \leq 1} \mathbf{E}_\mu\left(e^{\lambda(f - \mathbf{E}_\mu f)}\right). \tag{1}$$

The Cramér-Chernov-Chebychev inequality gives, for every $r > 0$,

$$\beta_\mu(r) = \sup_{\|f\|_{\mathrm{Lip}} \leq 1} \mu\left(|f - \mathbf{E}_\mu f| \geq r\right) \leq 2\exp\left(-\sup_{\lambda > 0}\left(r\lambda - \alpha_\mu(\lambda)\right)\right) \tag{2}$$

and the supremum in the right hand side is a Fenchel-Legendre transform of $\alpha_\mu$. Note
that $\beta_\mu$ is a uniform upper bound on the tail probabilities of Lipschitz images of $\mu$. We
are interested in the control of $\beta_\mu$ via $\alpha_\mu$ in the case where $\mu = \mathcal{M}(\nu, (\mu_\theta)_{\theta\in\Theta})$, in terms
of the mixing law $\nu$ and of the log-Laplace bounds $(\alpha_{\mu_\theta})_{\theta\in\Theta}$ for the mixed family.
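To make the Fenchel-Legendre step in (2) concrete, here is a minimal sketch for the special case of a quadratic log-Laplace bound, assuming it is normalised as $\alpha_\mu(\lambda) \leq C\lambda^2/4$. Under that assumption the supremum of $r\lambda - C\lambda^2/4$ is attained at $\lambda = 2r/C$ and equals $r^2/C$. The function name `sub_gaussian_tail_bound` is hypothetical.

```python
import math


def sub_gaussian_tail_bound(r, C):
    """Right hand side of (2) when alpha(lam) <= C * lam**2 / 4 (assumed form).

    The Legendre transform sup_{lam > 0} (r*lam - C*lam**2/4) is computed
    at its maximiser lam* = 2r/C, which gives r**2 / C.
    """
    if r < 0 or C <= 0:
        raise ValueError("need r >= 0 and C > 0")
    lam_star = 2.0 * r / C
    legendre = r * lam_star - C * lam_star ** 2 / 4.0
    return 2.0 * math.exp(-legendre)
```

The closed form makes explicit that a quadratic bound on $\alpha_\mu$ translates into a Gaussian tail for $\beta_\mu$, while a bound that is only linear for large $\lambda$ would give an exponential tail.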

We say that $\mu$ satisfies a sub-Gaussian concentration of measure for Lipschitz functions
when there exists a constant $C \in (0, \infty)$ such that for every real number $\lambda$,

$$\alpha_\mu(\lambda) \leq \frac{1}{4} C \lambda^2. \tag{3}$$

The log-Laplace-Lipschitz quadratic bound (3) implies via (2) that for every $r > 0$,

$$\beta_\mu(r) \leq 2\exp\left(-\frac{r^2}{C}\right). \tag{4}$$

Actually, it was shown (see [15] and [9]) that up to constants, (3) and (4) are equivalent,
and are also equivalent to the existence of a constant $\varepsilon \in (0, \infty)$ and $x_0 \in \mathbb{R}^d$ such that

$$\int_{\mathbb{R}^d} e^{\varepsilon |x - x_0|^2}\, \mu(dx) < \infty. \tag{5}$$

Linear or quadratic upper bounds for $\alpha_\mu$ may be deduced from functional inequalities such
as Poincaré and (Gross) logarithmic Sobolev inequalities [231 [23]. We say that $\mu$ satisfies
a Poincaré inequality of constant $C \in (0, \infty)$ when for every smooth $h : \mathbb{R}^d \to \mathbb{R}$,

$$\mathbf{Var}_\mu(h) \leq C\, \mathbf{E}_\mu\left(|\nabla h|^2\right) \tag{6}$$

where $\mathbf{Var}_\mu(h) = \mathbf{E}_\mu(h^2) - (\mathbf{E}_\mu h)^2$ is the variance of $h$ for $\mu$. The smallest possible
constant $C$ is called the optimal Poincaré constant of $\mu$ and is denoted $C_{\mathrm{PI}}(\mu)$ with the
convention $\inf \emptyset = \infty$. Similarly, $\mu$ satisfies a (Gross) logarithmic Sobolev inequality of
constant $C \in (0, \infty)$ when

$$\mathbf{Ent}_\mu(h^2) \leq C\, \mathbf{E}_\mu\left(|\nabla h|^2\right) \tag{7}$$

for every smooth function $h : \mathbb{R}^d \to \mathbb{R}$, where $\mathbf{Ent}_\mu(h^2) = \mathbf{E}_\mu(h^2 \log h^2) - \mathbf{E}_\mu(h^2) \log \mathbf{E}_\mu(h^2)$
is the entropy or free energy of $h^2$ for $\mu$, with the convention $0\log(0) = 0$. As for the
Poincaré inequality, the smallest possible $C$ is the optimal logarithmic Sobolev constant of
$\mu$ and is denoted $C_{\mathrm{GI}}(\mu)$ with $\inf \emptyset = \infty$. Standard linearization arguments give that

$$\rho(K_\mu) \leq C_{\mathrm{PI}}(\mu) \leq \frac{1}{2} C_{\mathrm{GI}}(\mu) \tag{8}$$

where $\rho(K_\mu)$ stands for the spectral radius of the covariance matrix $K_\mu$ of $\mu$ defined by
$(K_\mu)_{i,j} = \mathbf{E}_\mu(x_i x_j) - \mathbf{E}_\mu(x_i)\mathbf{E}_\mu(x_j)$ where $x_i$ and $x_j$ are the coordinate functions. More
precisely, the first inequality in (8) follows from (6) by taking $h = \langle \cdot, u \rangle$ where $u$ runs over
the unit sphere while the second inequality in (8) follows by considering the directional
derivative of both sides of (7) at the constant function 1.

A basic example is given by Gaussian laws for which equalities are achieved in (8). A
wide class of laws satisfy Poincaré and logarithmic Sobolev inequalities. Beyond Gaussian
laws, a criterion due to Bakry & Émery [21 [1] (see also [12], [MU2], and [TOl E]) states
that if $\mu$ has Lebesgue density $e^{-V}$ on $\mathbb{R}^d$ such that $x \mapsto V(x) - \frac{1}{2\kappa}|x|^2$ is convex for some
fixed real $\kappa > 0$ then $C_{\mathrm{PI}}(\mu) \leq \kappa$ and $C_{\mathrm{GI}}(\mu) \leq 2\kappa$ with equality in both cases when $\mu$ is
Gaussian. This log-concave criterion appears as a comparison with Gaussians. Note that
in general, $C_{\mathrm{GI}}(\mu) < \infty$ implies $C_{\mathrm{PI}}(\mu) < \infty$ but the converse is false. For instance, the law
with density proportional to $\exp(-|x|^a)$ on $\mathbb{R}$ satisfies a Poincaré inequality iff $a \geq 1$ and
a logarithmic Sobolev inequality iff $a \geq 2$, see e.g. [H Chapter 6]. Note also that if $\mu$ has
disconnected support, then necessarily $C_{\mathrm{PI}}(\mu) = C_{\mathrm{GI}}(\mu) = \infty$. To see it, consider a
non-constant $h$ which is constant on each connected component of the support of $\mu$. This is for
instance the case for the two-component mixture $\mu = p\mu_1 + q\mu_0 = \mathcal{M}(p\delta_1 + q\delta_0, \{\mu_0, \mu_1\})$
with $p \in (0, 1)$ and $q = 1 - p$ where $\mu_0$ and $\mu_1$ have disjoint supports.

The logarithmic Sobolev inequality (7) implies a sub-Gaussian concentration of
measure for Lipschitz images of $\mu$. Namely, using (7) with $h = \exp(\frac{1}{2}\lambda f)$ for a real number
$\lambda$ and a smooth Lipschitz function $f : \mathbb{R}^d \to \mathbb{R}$ gives via Rademacher's Theorem and a
standard argument attributed to Herbst [34, Chapter 5] that for any reals $\lambda$ and $r > 0$,

$$\alpha_\mu(\lambda) \leq \frac{1}{4} C_{\mathrm{GI}}(\mu) \lambda^2 \quad\text{and}\quad \beta_\mu(r) \leq 2\exp\left(-\frac{r^2}{C_{\mathrm{GI}}(\mu)}\right). \tag{9}$$

The same method yields from (6) a sub-exponential upper bound for $\beta_\mu$ of the form
$c_1 \exp(-c_2 r)$ for some constants $c_1, c_2 > 0$, see for instance [22] and [33, Section 2.5].

Both Poincaré and logarithmic Sobolev inequalities are invariant by the action of the
translation group and the orthogonal group. More generally, let us denote by $f \cdot \mu$ the
image measure of $\mu$ by the map $f$. Both (6) and (7) are stable by Lipschitz maps in the
sense that $C_{\mathrm{PI}}(f \cdot \mu) \leq \|f\|_{\mathrm{Lip}}^2 C_{\mathrm{PI}}(\mu)$ and $C_{\mathrm{GI}}(f \cdot \mu) \leq \|f\|_{\mathrm{Lip}}^2 C_{\mathrm{GI}}(\mu)$. On the real line,
$C_{\mathrm{PI}}$ and $C_{\mathrm{GI}}$ can be controlled via "simple" variational bounds such as (18). Both (6) and
(7) are also stable by bounded perturbations of the log-density of $\mu$, see [57], [25], and
[3] for further details. In view of sub-exponential or sub-Gaussian concentration bounds,
the main advantage of (6) and (7) over a direct approach based on $\alpha_\mu$ or $\beta_\mu$ lies in the
stability by tensor products of (6) and (7), see e.g. [1, Chapters 1 and 3], [9], and [21].

The case of mixtures. The integral criterion (5) shows that if the components
of a mixture uniformly satisfy a sub-Gaussian concentration of measure for Lipschitz
functions, and if the mixing law has compact support, then the mixture also satisfies a
sub-Gaussian concentration of measure for Lipschitz functions. Such bounds appear for
instance in [Bj. However, this observation does not give any fine quantitative estimate on
the dependency on the weights for a finite mixture. Regarding Poincaré and logarithmic
Sobolev inequalities, it is clear that a finite mixture of Gaussians satisfies such
inequalities since its log-density is a bounded perturbation of a uniformly concave function.
Here again, this does not give any fine control on the constants.

An upper bound for the Poincaré constant of univariate finite Gaussian mixtures was
provided by Johnson [2jJl Theorem 1.1 and Section 2]. Unfortunately, this upper bound
blows up when the minimum weight of the mixing law goes to 0. A more general upper
bound for finite mixtures of overlapping densities was obtained by Madras and Randall
[35, Theorem 1.2 and Section 5]. Here again, the bound blows up when the minimum
weight of the mixing law goes to 0. Some aspects of Poisson mixtures are considered by
Kontoyiannis and Madiman [30} I5T] in connection with compound Poisson processes and
discrete modified logarithmic Sobolev inequalities.

Outline of the article. Recall that the aim of the present work is to study fine
properties of mixtures of laws with respect to concentration of measure and Sobolev type
functional inequalities. The analysis of various elementary examples shows that
such a general objective is very ambitious. We therefore decided to focus in the present work
on more tractable situations. Section 2 provides Laplace bounds for Lipschitz functions
in the case of generic mixtures. These upper bounds on $\alpha_\mu$ (and thus $\beta_\mu$) for a mixture $\mu$
involve the $W_1$-diameter (see Section 2 for a precise definition) of the mixed family.
Section 3 is devoted to upper bounds on $\alpha_\mu$ for two-component mixtures $\mu = \mu_p = p\mu_1 + q\mu_0$.
Our result is mainly based on a Laplace-Lipschitz counterpart of the optimal logarithmic
Sobolev inequality for asymmetric Bernoulli measures. In particular, we show that if $\mu_0$
and $\mu_1$ satisfy a sub-Gaussian concentration for Lipschitz functions, then it is also the
case for the mixture $\mu_p$, with a quite satisfactory and intuitive behavior as $\min(p, q)$ goes
to 0. In Section 4 we study Poincaré and logarithmic Sobolev inequalities for
two-component mixtures. A decomposition of variance and entropy allows us to reduce the problem
to the Poincaré and logarithmic Sobolev inequalities for each component, to discrete
inequalities for the Bernoulli mixing law $p\delta_1 + q\delta_0$, and to the control of a mean-difference
term. This last term can be controlled in turn by using some support-constrained
transportation, leading to very interesting open questions in dimension $> 1$. The Poincaré
constant of the two-component mixture can remain bounded as $\min(p, q)$ goes to 0, while
the logarithmic Sobolev constant may surprisingly blow up at speed $-\log(\min(p, q))$. This
counter-intuitive result shows that as far as mixtures of laws are concerned, the
logarithmic Sobolev inequality does not behave like the sub-Gaussian concentration for Lipschitz
functions. We also illustrate our results on a gallery of concrete two-component mixtures.
In particular, we show that the blow up of the logarithmic Sobolev constant as $\min(p, q)$
goes to 0 is not necessarily related to support problems.

Open problems. The study of Poincaré and logarithmic Sobolev inequalities for
multivariate or non-finite mixtures is an interesting open problem, for which we give some
clues at the end of Section 4 in terms of support-constrained transportation interpolation.
There may be a link with the decomposition approach used in [25] for Markov chains.
One can also explore the tensor products of mixtures, which are again mixtures. Another
interesting problem is the development of a direct approach for transportation cost and
measure-capacity inequalities (see [5]) for mixtures, even in the finite univariate case.



2 General Laplace bounds for Lipschitz functions 

Intuitively, the concentration of measure of a finite mixture may be controlled by the worst
concentration of the components and some sort of diameter of the mixed family. We shall
confirm, extend, and illustrate this intuition for a not necessarily finite mixture. The
notion of diameter that we shall use is related to coupling and transportation cost. Recall
that for every $k \geq 1$, the Wasserstein (or transportation cost) distance of order $k$ between
two laws $\mu_1$ and $\mu_2$ on $\mathbb{R}^d$ is defined by (see e.g. [5lJ[52] and [311 ITT])

$$W_k(\mu_1, \mu_2) = \inf_\pi \left(\int_{\mathbb{R}^d \times \mathbb{R}^d} |x - y|^k \, d\pi(x, y)\right)^{1/k} \tag{10}$$

where $\pi$ runs over the set of laws on $\mathbb{R}^d \times \mathbb{R}^d$ with marginals $\mu_1$ and $\mu_2$. The $W_k$-convergence
is equivalent to the weak convergence together with the convergence of moments up to
order $k$. In dimension $d = 1$, we have, by denoting $F_1$ and $F_2$ the cumulative distribution
functions of $\mu_1$ and $\mu_2$, with generalized inverses $F_1^{-1}$ and $F_2^{-1}$, for every $k \geq 1$,

$$W_k(\mu_1, \mu_2)^k = \int_0^1 \left|F_1^{-1}(x) - F_2^{-1}(x)\right|^k dx \quad\text{and}\quad W_1(\mu_1, \mu_2) = \int_{\mathbb{R}} |F_1(x) - F_2(x)| \, dx \tag{11}$$

where the last expression of $W_1$ follows from the Kantorovich-Rubinstein dual formulation

$$W_1(\mu_1, \mu_2) = \sup_{\|f\|_{\mathrm{Lip}} \leq 1} \left(\int_{\mathbb{R}^d} f \, d\mu_1 - \int_{\mathbb{R}^d} f \, d\mu_2\right). \tag{12}$$

Note that if $\mu_1$ does not give mass to points then $\mu_2 = (F_2^{-1} \circ F_1) \cdot \mu_1$. The transportation
cost distances lead to the so-called transportation cost inequalities, popularized by Marton
I3SH32], Talagrand 09], and Bobkov & Götze [5]. See for instance the books [SI EH [52]
for a review. The link with concentration of measure was recently deeply explored by
Gozlan, see e.g. [21]. We will not use this interesting line of research in the present paper.
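The one-dimensional quantile formula for $W_1$ is easy to evaluate for empirical measures. The sketch below assumes two samples of equal size, each point carrying the same mass; in that case both quantile functions are step functions with the same steps, and the integral of their absolute difference reduces to the average distance between order statistics. The helper name `w1_empirical` is ours.

```python
def w1_empirical(xs, ys):
    """W_1 distance between the uniform empirical measures on xs and on ys.

    Assumes len(xs) == len(ys): the quantile functions are then piecewise
    constant on the same grid, so the quantile integral is a finite average.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in paired) / len(xs)
```

For instance, translating a sample by a constant `t` gives a distance of exactly `abs(t)`, in line with the value of $W_1$ between two translates of the same law.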

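Theorem 2.1 (General Laplace-Lipschitz bound via diameter). Let $\mu = \mathcal{M}(\nu, (\mu_\theta)_{\theta\in\Theta})$
be a general mixture. If this mixture satisfies the uniform bounds

$$\overline{\alpha} = \sup_{\theta\in\Theta} \alpha_{\mu_\theta} < \infty \quad\text{and}\quad \overline{W} = \sup_{\theta,\theta'\in\Theta} W_1(\mu_\theta, \mu_{\theta'}) < \infty$$

then for every $\lambda > 0$ we have

$$\alpha_\mu(\lambda) \leq \overline{\alpha}(\lambda) + \frac{1}{8}\min\left(8\overline{W}\lambda,\ \overline{W}^2\lambda^2\right).$$

Proof of Theorem 2.1. The key point is that if $\|f\|_{\mathrm{Lip}} \leq 1$ then for every $\lambda > 0$,

$$\mathbf{E}_\mu\left(e^{\lambda(f - \mathbf{E}_\mu f)}\right) = e^{-\lambda \mathbf{E}_\mu f} \int_\Theta \mathbf{E}_{\mu_\theta}\left(e^{\lambda f}\right) \nu(d\theta) \leq \int_\Theta e^{\alpha_{\mu_\theta}(\lambda) + \lambda(\mathbf{E}_{\mu_\theta} f - \mathbf{E}_\mu f)} \, \nu(d\theta). \tag{13}$$

As a consequence, we get

$$\alpha_\mu(\lambda) \leq \overline{\alpha}(\lambda) + \sup_{\|f\|_{\mathrm{Lip}} \leq 1} \log \int_\Theta e^{\lambda(\mathbf{E}_{\mu_\theta} f - \mathbf{E}_\mu f)} \, \nu(d\theta). \tag{14}$$

Thanks to the relation (12), we obtain

$$\mathbf{E}_{\mu_\theta} f - \mathbf{E}_\mu f = \int_\Theta \left(\mathbf{E}_{\mu_\theta} f - \mathbf{E}_{\mu_{\theta'}} f\right) \nu(d\theta') \leq \int_\Theta W_1(\mu_\theta, \mu_{\theta'}) \, \nu(d\theta') \leq \overline{W}.$$

This shows that the second term in the right hand side of (14) is bounded by $\overline{W}\lambda$.
Alternatively, one can use the Hoeffding bound [26] which says that if $X$ is a centered bounded
random variable with oscillation $c = \sup X - \inf X$ then

$$\mathbf{E}\left(e^{\lambda X}\right) \leq e^{\frac{1}{8}\lambda^2 c^2}.$$

The desired bound in terms of $\overline{W}^2\lambda^2$ follows by taking $X = \mathbf{E}_{\mu_Y} f - \mathbf{E}_\mu f$ where $Y \sim \nu$
and noticing that $c \leq \sup_{\theta,\theta'} \left(\mathbf{E}_{\mu_\theta} f - \mathbf{E}_{\mu_{\theta'}} f\right) \leq \overline{W}$. □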
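The bound of Theorem 2.1 combines a linear and a quadratic correction, and it can help to see where one takes over from the other. The sketch below evaluates the right hand side for given values of the uniform component bound and of the $W_1$-diameter; the two corrections coincide at $\lambda = 8/W$, the quadratic one being smaller before and the linear one after. The function and argument names are illustrative only.

```python
def mixture_laplace_bound(alpha_bar, w_bar, lam):
    """Right hand side of Theorem 2.1 at a given lam > 0.

    alpha_bar: value at lam of the uniform log-Laplace bound of the components.
    w_bar:     W_1-diameter of the mixed family (assumed finite).
    """
    if lam <= 0 or w_bar < 0:
        raise ValueError("need lam > 0 and w_bar >= 0")
    linear = 8.0 * w_bar * lam        # from the direct bound by w_bar * lam
    quadratic = (w_bar * lam) ** 2    # from the Hoeffding bound
    return alpha_bar + min(linear, quadratic) / 8.0
```

For Gaussian components with variance at most `rho`, one would call it with `alpha_bar = rho * lam**2 / 2`.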

Example 2.2 (Finite mixtures). For a finite mixture $\mu = p_1\mu_1 + \cdots + p_n\mu_n = \mathcal{M}(\nu, (\mu_i)_{1 \leq i \leq n})$
where $\nu = p_1\delta_1 + \cdots + p_n\delta_n$, the mixing measure $\nu$ is supported by a finite set. In that
case, Theorem 2.1 gives an immediate Laplace bound, involving the worst bound for the
mixture components $(\mu_i)_{1 \leq i \leq n}$ (this cannot be improved in general). However, in Section 3
we provide sharper bounds by improving the dependency on $\nu$ in the case where $n = 2$.

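Example 2.3 (Bounded mixtures of multivariate Gaussians). Here $\mu_\theta = \mathcal{N}(m(\theta), \Gamma(\theta))$
where $m : \Theta \to \mathbb{R}^d$ and $\Gamma : \Theta \to S_d^+$ are two measurable bounded functions and $S_d^+$ is the
cone of symmetric nonnegative $d \times d$ matrices. Note that $\Gamma(\theta)$ is allowed to be singular, i.e.
not of full rank. The spectrum of $\Gamma(\theta)$ is real and non-negative. If $\lambda_1(\theta) \geq \cdots \geq \lambda_d(\theta)$ are
the eigenvalues of $\Gamma(\theta)$, we define $\rho = \sup_{\theta\in\Theta} \lambda_1(\theta) = \sup_{\theta\in\Theta} \|\Gamma(\theta)\|_{2\to 2}$. Now fix some
mixing law $\nu$ on $\Theta$ and consider the mixture $\mu = \mathcal{M}(\nu, (\mu_\theta)_{\theta\in\Theta})$. Then for every $\lambda > 0$,

$$\alpha_\mu(\lambda) \leq \frac{\rho}{2}\lambda^2 + \frac{1}{8}\min\left(8\overline{W}\lambda,\ \overline{W}^2\lambda^2\right).$$

One can deduce an upper bound for $\overline{W}$ from the following lemma.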

Lemma 2.4 ($W_1$-distance of two multivariate Gaussian laws). Let $\mu_0 = \mathcal{N}(m(0), \Gamma(0))$
and $\mu_1 = \mathcal{N}(m(1), \Gamma(1))$ be two Gaussian laws on $\mathbb{R}^d$. For $\theta \in \{0, 1\}$, we denote by

$$\lambda_1(\theta) \geq \cdots \geq \lambda_d(\theta)$$

the ordered spectrum of $\Gamma(\theta)$ and by $(v_i(\theta))_{1 \leq i \leq d}$ an associated orthonormal basis of
eigenvectors. Assume, without loss of generality, that $v_i(0) \cdot v_i(1) \geq 0$ for every $1 \leq i \leq d$ where
"$\cdot$" stands for the Euclidean scalar product of $\mathbb{R}^d$. Then $W_1(\mu_0, \mu_1)$ is bounded above by

$$|m(1) - m(0)| + \sqrt{\sum_{i=1}^d \left\{\left(\sqrt{\lambda_i(1)} - \sqrt{\lambda_i(0)}\right)^2 + 2\sqrt{\lambda_i(1)\lambda_i(0)}\,\left(1 - v_i(1) \cdot v_i(0)\right)\right\}}.$$

The reader may find in [48, Theorem 3.2] a formula in the same spirit for $W_2(\mu_0, \mu_1)$.
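Proof of Lemma 2.4. The triangle inequality for the $W_1$ distance gives

$$W_1(\mu_0, \mu_1) \leq W_1(\mu_0, \mathcal{N}(m(1), \Gamma(0))) + W_1(\mathcal{N}(m(1), \Gamma(0)), \mu_1) \leq |m(1) - m(0)| + W_1(\mathcal{N}(0, \Gamma(0)), \mathcal{N}(0, \Gamma(1))).$$

Now, if $(Y_i)_{1 \leq i \leq d}$ are i.i.d. real random variables of law $\mathcal{N}(0, 1)$ then the law of

$$X_\theta = \sum_{i=1}^d Y_i \sqrt{\lambda_i(\theta)}\, v_i(\theta)$$

is $\mathcal{N}(0, \Gamma(\theta))$ for $\theta \in \{0, 1\}$. Moreover, from (10) and Jensen's inequality, we get

$$W_1(\mathcal{N}(0, \Gamma(0)), \mathcal{N}(0, \Gamma(1)))^2 \leq \left(\mathbf{E}|X_1 - X_0|\right)^2 \leq \mathbf{E}\left(|X_1 - X_0|^2\right).$$

At this step, we note that

$$|X_1 - X_0|^2 = \sum_{i=1}^d Y_i^2 \left|\sqrt{\lambda_i(1)}\, v_i(1) - \sqrt{\lambda_i(0)}\, v_i(0)\right|^2 + 2\sum_{i<j} Y_i Y_j \left(\sqrt{\lambda_i(1)}\, v_i(1) - \sqrt{\lambda_i(0)}\, v_i(0)\right) \cdot \left(\sqrt{\lambda_j(1)}\, v_j(1) - \sqrt{\lambda_j(0)}\, v_j(0)\right).$$

Since $(Y_i)$ are i.i.d. $\mathcal{N}(0, 1)$ and $(v_i(\theta))_{1 \leq i \leq d}$ is orthonormal for $\theta \in \{0, 1\}$, one has

$$\mathbf{E}\left(|X_1 - X_0|^2\right) = \sum_{i=1}^d \left|\sqrt{\lambda_i(1)}\, v_i(1) - \sqrt{\lambda_i(0)}\, v_i(0)\right|^2 = \sum_{i=1}^d \left\{\left(\sqrt{\lambda_i(1)} - \sqrt{\lambda_i(0)}\right)^2 + 2\sqrt{\lambda_i(1)\lambda_i(0)}\,\left(1 - v_i(1) \cdot v_i(0)\right)\right\}.$$

□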



Of course the assumptions of Theorem 2.1 may be relaxed. Instead of trying to deal
with generic abstract results, let us provide some illuminating examples.

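Example 2.5 (Gaussian mixture of translated Gaussians). Here $\Theta = \mathbb{R}$ and $\mu_\theta = \mathcal{N}(\theta, \sigma^2)$
for some fixed $\sigma > 0$, and the mixing law is also Gaussian $\nu = \mathcal{N}(0, \tau^2)$ for some fixed
$\tau > 0$. In this case, $\overline{\alpha}(\lambda) = \frac{1}{2}\sigma^2\lambda^2$ but $\overline{W}$ is infinite since

$$W_1(\mu_\theta, \mu_{\theta'}) = |\theta - \theta'|.$$

In particular, Theorem 2.1 is useless. Nevertheless, the function

$$\theta \mapsto g(\theta) = \mathbf{E}_{\mu_\theta} f - \mathbf{E}_\mu f$$

is Lipschitz since

$$|g(\theta) - g(\theta')| \leq \mathbf{E}\left(|f(X + \theta) - f(X + \theta')|\right) \leq |\theta - \theta'|$$

where $X \sim \mathcal{N}(0, \sigma^2)$. As a consequence, we get

$$\sup_{\|f\|_{\mathrm{Lip}} \leq 1} \log \int_\Theta e^{\lambda(\mathbf{E}_{\mu_\theta} f - \mathbf{E}_\mu f)} \, \nu(d\theta) \leq \frac{\tau^2\lambda^2}{2},$$

and for any $\lambda > 0$

$$\alpha_\mu(\lambda) \leq \frac{\sigma^2 + \tau^2}{2}\lambda^2.$$

The same argument may be used more generally for "position" mixtures. For instance if
$\eta$ is some fixed probability measure on $\mathbb{R}^d$ and $\mu_\theta = \eta * \delta_\theta$ for $\theta \in \mathbb{R}^d$ then for all $\lambda > 0$,

$$\alpha_\mu(\lambda) \leq \alpha_\nu(\lambda) + \alpha_\eta(\lambda).$$

In this particular case, $\mu = \nu * \eta$ and the bound above also follows by tensorization.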

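Example 2.6 (Mixture of scaled Gaussians: from exponential to Gaussian tails). Here
we take $\Theta = [0, \infty)$ and $\mu_\theta = \mathcal{N}(0, \theta^2)$ with a mixing measure $\nu$ of density

$$\frac{1}{\Gamma(1 + 1/\gamma)} \exp(-\theta^\gamma)\, \mathbf{1}_{[0,\infty)}(\theta)$$

where $\gamma \geq 2$ is some fixed real number. Note that $\nu$ has a non-compact support and that
$\mu$ does not satisfy the integral criterion (5). This means that $\mu$ cannot have sub-Gaussian
tails. Note also that both $\overline{\alpha}(\lambda)$ and $\overline{W}$ are infinite since

$$\alpha_{\mu_\theta}(\lambda) = \frac{\theta^2\lambda^2}{2} \quad\text{and}\quad W_1(\mu_\theta, \mu_{\theta'}) = \sqrt{\frac{2}{\pi}}\, |\theta - \theta'|$$

where we used (11) for $W_1$. Starting from (13), one has by the Cauchy-Schwarz inequality

$$\left(\mathbf{E}_\mu\left(e^{\lambda(f - \mathbf{E}_\mu f)}\right)\right)^2 \leq \int_\Theta e^{\theta^2\lambda^2} \, \nu(d\theta) \int_\Theta e^{2\lambda(\mathbf{E}_{\mu_\theta} f - \mathbf{E}_\mu f)} \, \nu(d\theta). \tag{15}$$

Note that $\nu$ satisfies condition (5) and $\alpha_\nu(\lambda) \leq C\lambda^2$ for some real constant $C > 0$. Here
and in the sequel, the constant $C$ may vary from line to line and may be chosen independent
of $\gamma$. On the other hand, the centered function $g(\theta) = \mathbf{E}_{\mu_\theta} f - \mathbf{E}_\mu f$ is 1-Lipschitz since

$$|g(\theta) - g(\theta')| = |\mathbf{E} f(\theta X) - \mathbf{E} f(\theta' X)| \leq |\theta - \theta'|\, \mathbf{E}(|X|)$$

where $X \sim \mathcal{N}(0, 1)$. Also, for the second term in the right hand side of (15) we have

$$\int_\Theta e^{2\lambda(\mathbf{E}_{\mu_\theta} f - \mathbf{E}_\mu f)} \, \nu(d\theta) \leq e^{\alpha_\nu(2\lambda)} \leq e^{4C\lambda^2}.$$

If $\gamma = 2$ then $\alpha_\mu(\lambda) \leq 2C\lambda^2 - \frac{1}{4}\log(1 - \lambda^2) \leq 2C - \frac{1}{4}\log(1 - \lambda)$ if $\lambda < 1$, which gives, after
some computations, the deviation bound, for some other constants $C' > 0$ and $C'' > 0$,

$$\mu(f - \mathbf{E}_\mu f \geq r) \leq C' e^{-C'' r}.$$

Assume in contrast that $\gamma > 2$. Since $\theta^2\lambda^2 \leq \gamma^{-1}\theta^\gamma + C_0\lambda^{\frac{2\gamma}{\gamma - 2}}$ for some constant $C_0 > 0$
which may depend on $\gamma$ but not on $\lambda$ and $\theta$, we get, for some constants $C_1 > 0$ and $C_2 > 0$,

$$\int_0^\infty \exp\left(\theta^2\lambda^2\right) \nu(d\theta) \leq C_1 \exp\left(C_2 \lambda^{\frac{2\gamma}{\gamma - 2}}\right).$$

This gives $\alpha_\mu(\lambda) \leq C_3\lambda^{\frac{2\gamma}{\gamma - 2}} + C_4$ for some constants $C_3 > 0$ and $C_4 > 0$, which yields a
deviation bound of the form (for some constants $C_5 > 0$ and $C_6 > 0$)

$$\mu(f - \mathbf{E}_\mu f \geq r) \leq C_5 \exp\left(-C_6\, r^{\frac{2\gamma}{\gamma + 2}}\right).$$

Note that $\nu$ goes to the uniform law on $[0, 1]$ as $\gamma \to \infty$ and the Gaussian tail reappears.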



3 Concentration bounds for two-component mixtures

In this section, we investigate the special case where the mixing measure $\nu$ is the Bernoulli
measure $\mathcal{B}(p) = p\delta_1 + q\delta_0$ where $q = 1 - p$. We are interested in the study of the sharp
dependence of the concentration bounds on $p$, especially when $p$ is close to 0 or 1.

Theorem 3.1 (Two-component mixture). Let $\mu_0$ and $\mu_1$ be two probability measures on
$\mathcal{X}$ and $\mu = p\mu_1 + q\mu_0$ with $p \in [0, 1]$ and $q = 1 - p$. Define $x_p = \max(p, q)/(2c_p)$ where

$$c_p = \frac{q - p}{4(\log(q) - \log(p))}$$

with the continuity conventions $c_{1/2} = 1/8$ and $c_0 = c_1 = 0$. Then for any $\lambda > 0$,

$$\alpha_\mu(\lambda) \leq \max(\alpha_{\mu_0}, \alpha_{\mu_1})(\lambda) + \begin{cases} c_p \lambda^2 W_1(\mu_0, \mu_1)^2 & \text{if } \lambda W_1(\mu_0, \mu_1) \leq x_p, \\ \max(p, q)\left(\lambda W_1(\mu_0, \mu_1) - \frac{1}{2}x_p\right) & \text{otherwise.} \end{cases}$$

Note that if $\min(p, q) \to 0$, then $c_p \sim -(4\log(\min(p, q)))^{-1} \to 0$ and $x_p \to \infty$, and we thus
recover an upper bound of the form $\alpha_\mu \leq \max(\alpha_{\mu_0}, \alpha_{\mu_1})$ as $\min(p, q) \to 0$, which is
satisfactory. The two different upper bounds given by Theorem 3.1 provide two different
upper bounds for the concentration of measure of the mixture $\mu$, illustrated by the following
Corollary (the proof of the Corollary is immediate and is left to the reader).
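The two regimes of Theorem 3.1 are easy to tabulate numerically. The following sketch implements the constant $c_p = (q - p)/(4(\log q - \log p))$ with its continuity conventions, the threshold $x_p = \max(p, q)/(2c_p)$, and the additive correction term as a function of $\lambda$ and of $W_1(\mu_0, \mu_1)$. The function names are ours, and the correction is only the second term of the bound: the component term $\max(\alpha_{\mu_0}, \alpha_{\mu_1})(\lambda)$ has to be added by the caller.

```python
import math


def c_p(p):
    """Two-point constant (q - p) / (4 (log q - log p)), with c_0 = c_1 = 0, c_{1/2} = 1/8."""
    q = 1.0 - p
    if p <= 0.0 or p >= 1.0:
        return 0.0
    if math.isclose(p, 0.5):
        return 0.125
    return (q - p) / (4.0 * (math.log(q) - math.log(p)))


def two_component_correction(p, lam, w1):
    """Additive correction term of the two-component bound at lam > 0."""
    m = max(p, 1.0 - p)
    cp = c_p(p)
    x = lam * w1
    # When c_p = 0 (p in {0, 1}) the threshold x_p is infinite: quadratic regime only.
    x_p = math.inf if cp == 0.0 else m / (2.0 * cp)
    if x <= x_p:
        return cp * x * x          # quadratic (sub-Gaussian) regime
    return m * (x - 0.5 * x_p)     # linear regime, tangent to the parabola at x_p
```

The linear branch is the tangent line to $x \mapsto c_p x^2$ at $x = x_p$, so the correction is continuous and differentiable across the threshold.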

Corollary 3.2 (Two-component mixtures with sub-Gaussian tails). Let $\mu_0$ and $\mu_1$ be two
probability measures on $\mathcal{X}$ and $\mu = p\mu_1 + q\mu_0$ for some $p \in [0, 1]$ with $q = 1 - p$. If there
exists a real constant $C > 0$ such that for any $\lambda > 0$

$$\max(\alpha_{\mu_0}, \alpha_{\mu_1})(\lambda) \leq \frac{1}{2} C\lambda^2$$

then for every $r \geq 0$, with $W = W_1(\mu_0, \mu_1)$,

$$\beta_\mu(r) \leq 2 \begin{cases} \exp\left(-\dfrac{r^2}{2C + 4c_p W^2}\right) & \text{if } r \leq \max(p, q)\left(\dfrac{C}{2c_p W} + W\right), \\[2ex] \exp\left(-\dfrac{1}{2C}\left(r - \max(p, q) W\right)^2 - \dfrac{\max(p, q)^2}{4c_p}\right) & \text{otherwise.} \end{cases}$$



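Proof of Theorem 3.1. We have $\mu = q\mu_0 + p\mu_1 = \mathcal{M}(\nu, \{\mu_0, \mu_1\})$ where $\nu := q\delta_0 + p\delta_1$.
For this finite mixture, we get, as in the general case, for any $f \in \mathrm{Lip}(\mathcal{X}, \mathbb{R})$ with $\|f\|_{\mathrm{Lip}} \leq 1$ and $\lambda > 0$,

$$\log \mathbf{E}_\mu\left(e^{\lambda(f - \mathbf{E}_\mu f)}\right) \leq \max(\alpha_{\mu_0}, \alpha_{\mu_1})(\lambda) + \log \mathbf{E}_\nu\left(e^{\lambda(g - \mathbf{E}_\nu g)}\right)$$

where $g(i) := \mathbf{E}_{\mu_i} f$. At this step, we use the particular nature of $\nu$, which gives

$$\mathbf{E}_\nu\left(e^{\lambda(g - \mathbf{E}_\nu g)}\right) = \cosh_p\left(\lambda(g(1) - g(0))\right),$$

where $\cosh_p(x) := pe^{qx} + qe^{-px}$. Since $g(1) - g(0) = \mathbf{E}_{\mu_1} f - \mathbf{E}_{\mu_0} f$, we get by (12)

$$-W_1(\mu_0, \mu_1) \leq g(1) - g(0) \leq W_1(\mu_0, \mu_1).$$

Since $\cosh_p(-x) = \cosh_q(x)$ for any $x \in \mathbb{R}$, we get for any $\lambda > 0$,

$$\sup_{\|f\|_{\mathrm{Lip}} \leq 1} \mathbf{E}_\nu\left(e^{\lambda(g - \mathbf{E}_\nu g)}\right) \leq \max(\cosh_p, \cosh_q)\left(\lambda W_1(\mu_0, \mu_1)\right).$$

Putting all together, we obtain, for any $\lambda > 0$,

$$\alpha_\mu(\lambda) \leq \max(\alpha_{\mu_0}, \alpha_{\mu_1})(\lambda) + \log \max(\cosh_p, \cosh_q)\left(\lambda W_1(\mu_0, \mu_1)\right).$$

Since $(\cosh_q - \cosh_p)'(x) = 2pq(\cosh(px) - \cosh(qx))$, one has, for every $x \geq 0$,

$$\max(\cosh_p, \cosh_q)(x) = \cosh_{\min(p,q)}(x).$$

Let us assume that $p \leq q$. Lemma 3.3 ensures that, for every $x \geq 0$,

$$\log \max(\cosh_p, \cosh_q)(x) = \log\cosh_p(x) \leq c_p x^2.$$

On the other hand,

$$\log\cosh_p(x) = qx + \log\left(p + qe^{-x}\right) \leq qx.$$

Now, for $x = x_p$, the slope of $x \mapsto c_p x^2$ is equal to $q$ and the tangent is $y = q(x - x_p/2)$. On
the other hand, the convexity of $x \mapsto \log\cosh_p(x)$, whose slope never exceeds $q$, yields
$\log\cosh_p(x) \leq q(x - x_p/2)$ for $x \geq x_p$
(drawing a picture may help the reader). The desired conclusion follows immediately. □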

The proof of Theorem 3.1 relies on Lemma 3.3 below which provides a Gaussian bound
for the Laplace transform of a Lipschitz function with respect to a Bernoulli law. This
lemma is an optimal version of the Hoeffding bound [26] in the case of a Bernoulli law.

Lemma 3.3 (Two-point lemma). For any $0 \leq p \leq 1/2$, we have

$$\sup_{x > 0}\ x^{-2}\log\left(pe^{qx} + qe^{-px}\right) = c_p = \frac{q - p}{4(\log(q) - \log(p))} \tag{16}$$

with the natural conventions $c_0 = 0$ and $c_{1/2} = 1/8$ as in Theorem 3.1. Moreover, the
supremum in $x$ is achieved for $x = 2(\log(q) - \log(p))$.

The constant $c_p$ is also equal, as it will appear in the proof, to $\sup_{\lambda > 0} \alpha_{\mathcal{B}(p)}(\lambda)/\lambda^2$.
The classical Hoeffding bound for this supremum is $c_{1/2} = 1/8$ which is the maximum of
$c_p$ over $p$. Additionally, the quantity $1/(4c_p)$ is the optimal constant of the logarithmic
Sobolev inequality for the asymmetric Bernoulli measure $q\delta_0 + p\delta_1$ (see Lemma 4.1).
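The two-point lemma can be sanity-checked numerically. The sketch below evaluates the ratio $x^{-2}\log(pe^{qx} + qe^{-px})$, the claimed maximiser $x = 2(\log q - \log p)$ and the claimed supremum $(q - p)/(4(\log q - \log p))$ for a given $0 < p < 1/2$. It is a numerical illustration on a grid, not a proof, and the function names are hypothetical.

```python
import math


def two_point_ratio(p, x):
    """x**-2 * log(p*exp(q*x) + q*exp(-p*x)) for 0 < p < 1 and x > 0."""
    q = 1.0 - p
    return math.log(p * math.exp(q * x) + q * math.exp(-p * x)) / (x * x)


def claimed_supremum(p):
    """Claimed maximiser and supremum of the ratio, for 0 < p < 1/2."""
    if not 0.0 < p < 0.5:
        raise ValueError("this sketch assumes 0 < p < 1/2")
    q = 1.0 - p
    log_ratio = math.log(q) - math.log(p)
    return 2.0 * log_ratio, (q - p) / (4.0 * log_ratio)
```

Evaluating `two_point_ratio` at the claimed maximiser reproduces the claimed supremum up to rounding, and values on a grid of other points stay below it.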

Proof of Lemma 3.3. Let us define $x_p = \log(q/p)$ and $\beta(x) = x^{-2}\psi(x)$ where

$$\psi(x) = \log\left(pe^{qx} + qe^{-px}\right).$$

The function $\psi$ is "strongly convex" at the origin ($\psi(0) = \psi'(0) = 0$ and $\psi''(0) = pq$ and
$\psi'''(0) > 0$) and linear at infinity ($\psi'(\infty) = q$). Therefore, the supremum of $\beta$ is achieved
for some $x > 0$. The derivative of $\beta$ has the sign of $\gamma(x) := x\psi'(x) - 2\psi(x)$. Furthermore,

$$\gamma'(x) = x\psi''(x) - \psi'(x) \quad\text{and}\quad \gamma''(x) = x\psi'''(x).$$

As a consequence, $\gamma''$ has the sign of $\psi'''$ which is positive on $(0, x_p)$ and negative on
$(x_p, +\infty)$. Since $\gamma'(0) = 0$ and $\gamma'$ achieves its maximum for $x = x_p$ and $\gamma'$ goes to $-q$ at
infinity, there exists a unique $y_p > 0$ (in fact $y_p > x_p$) such that $\gamma'(y_p) = 0$. As a
conclusion, since $\gamma(0) = 0$ and $\gamma$ is increasing on $(0, y_p)$ and $\gamma(x)$ goes to $-\infty$ as $x$ goes to
infinity, $\gamma(x)$ is equal to zero exactly two times: for $x = 0$ and $x = z_p > y_p > x_p$. In fact,
$z_p$ is equal to $2x_p$. Indeed, we have

$$\psi'(x) = pq\, \frac{e^{qx} - e^{-px}}{pe^{qx} + qe^{-px}}.$$

Now, we compute

$$\psi'(2x_p) = pq\, \frac{(q/p)^{2q} - (p/q)^{2p}}{p(q/p)^{2q} + q(p/q)^{2p}} = q - p,$$

and

$$2\psi(2x_p) = 2\log\left(p(q/p)^{2q} + q(p/q)^{2p}\right) = 2\log\left((q + p)(q/p)^{q - p}\right) = 2x_p\psi'(2x_p).$$

Thus, $2x_p$ is (the unique positive) solution of $2\psi(x) = x\psi'(x)$. As a conclusion, we get
$c_p = \psi(2x_p)/(4x_p^2)$, which gives the desired formula after some algebra. □

Remark 3.4 (Advantage of direct Laplace bounds). Consider a mixture $\mu = p\mu_1 + q\mu_0$ of
two Gaussian laws $\mu_0$ and $\mu_1$ on $\mathbb{R}$ with same variance $\sigma^2$ and different means. Corollary 3.2
ensures that for every $r \geq 0$,

$$\beta_\mu(r) \leq 2\exp\left(-\frac{r^2}{2\sigma^2 + 4c_p W_1(\mu_0, \mu_1)^2}\right).$$

This bound remains relevant as $\sigma \to 0$ since we recover the bound for the Bernoulli mixing
law $\nu = p\delta_1 + q\delta_0$. On the other hand, any concentration bound deduced from a logarithmic
Sobolev inequality would blow up as $\sigma$ goes to zero, as we shall see in Section 4.

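Remark 3.5 (Inhomogeneous tails). It is satisfactory to recover, when $p$ goes to 0 (resp.
1), the concentration bound of $\mu_0$ (resp. $\mu_1$) and not only the maximum of the bounds of
the two components. It is possible to exhibit two regimes, corresponding to small and big
values of $\lambda$. Assume that $\mu_i = \mathcal{N}(0, \theta_i^2)$ for $i \in \{0, 1\}$ with $\theta_1 > \theta_0 > 0$. Theorem 2.1 gives

$$\alpha_\mu(\lambda) \leq \frac{\theta_1^2\lambda^2}{2} + (\theta_1 - \theta_0)\lambda.$$

On the other hand, one has

$$\mathbf{E}_\mu\left(e^{\lambda(f - \mathbf{E}_\mu f)}\right) \leq \exp\left(\int_\Theta \alpha_{\mu_\theta}(\lambda)\, \nu(d\theta)\right) \int_\Theta e^{H_\lambda(\theta) + \lambda g(\theta)}\, \nu(d\theta)$$

where

$$H_\lambda(\theta) = \alpha_{\mu_\theta}(\lambda) - \int_\Theta \alpha_{\mu_{\theta'}}(\lambda)\, \nu(d\theta') \quad\text{and}\quad g(\theta) = \mathbf{E}_{\mu_\theta} f - \mathbf{E}_\mu f.$$

Then, Lemma 3.3 ensures that for every $\varepsilon > 0$,

$$\log \int_\Theta e^{H_\lambda(\theta) + \lambda g(\theta)}\, \nu(d\theta) \leq c_p\left(H_\lambda(1) + \lambda g(1) - H_\lambda(0) - \lambda g(0)\right)^2 \leq c_p\left(\frac{1}{\varepsilon}\left|H_\lambda(1) - H_\lambda(0)\right|^2 + \varepsilon\left|\lambda g(1) - \lambda g(0)\right|^2\right).$$

Choosing $\varepsilon = \lambda$ leads to

$$\log \int_\Theta e^{H_\lambda(\theta) + \lambda g(\theta)}\, \nu(d\theta) \leq c_p\left(\frac{(\theta_1^2 - \theta_0^2)^2}{4} + (\theta_1 - \theta_0)^2\right)\lambda^3.$$

As a conclusion, $\alpha_\mu$ can be controlled in (at least) these two ways:

$$\alpha_\mu(\lambda) \leq \begin{cases} \dfrac{\theta_1^2\lambda^2}{2} + (\theta_1 - \theta_0)\lambda, \\[2ex] \dfrac{(p\theta_1^2 + q\theta_0^2)\lambda^2}{2} + c_p\left(\dfrac{(\theta_1^2 - \theta_0^2)^2}{4} + (\theta_1 - \theta_0)^2\right)\lambda^3. \end{cases}$$

The second one provides sharp bounds for $\lambda \leq f(1/c_p)$ whereas the first one is useful for
$\lambda \geq f(1/c_p)$ (where $f$ is an increasing function which is computable).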



4 Gross-Poincare inequalities for two- component mixtures 

It is known that functional inequalities such as Poincaré and (Gross) logarithmic Sobolev
inequalities provide, via Laplace exponential bounds, dimension free concentration bounds,
see for instance [33]. It is quite natural to ask for such functional inequalities for mixtures.
Before attacking the problem, some facts have to be emphasized.

As already mentioned in the introduction, a law $\mu$ with disconnected support cannot
satisfy a Poincaré or a logarithmic Sobolev inequality. In particular, a mixture of laws with
disjoint supports cannot satisfy such functional inequalities. This observation suggests that
in order to obtain a functional inequality for a mixture, one probably has to control the
considered functional inequality for each component of the mixture and to ensure that the
support of the mixture is connected. It is important to realize that such a connectivity
problem is due to the peculiarities of the functional inequalities, but does not pose a
real problem for the concentration of measure properties, as suggested by Theorem 3.1
and Remark 3.4 for instance. In the sequel, we will focus on the case of two-component
mixtures, and try to get sharp bounds on the Poincaré and logarithmic Sobolev constants.
The two-component case is fundamental. The extension of the results to more general
finite mixtures is possible by following roughly the same scheme, see Remark 4.2 below.

For the logarithmic Sobolev inequality of two-component mixtures, we will make use
of the following optimal two-point Lemma, obtained years ago independently by Diaconis
& Saloff-Coste and Higuchi & Yoshida. An elementary proof due to Bobkov is given by
Saloff-Coste in his Saint-Flour Lecture Notes [45].

Lemma 4.1 (Optimal logarithmic Sobolev inequality for Bernoulli measures). For every
$p \in (0, 1)$ and every $f : \{0, 1\} \to \mathbb{R}$, and with the convention $(\log(q) - \log(p))/(q - p) = 2$
if $p = q = 1/2$, we have

$$\mathrm{Ent}_{p\delta_1 + q\delta_0}(f^2) \le pq\,\frac{\log(q) - \log(p)}{q - p}\,(f(0) - f(1))^2.$$

Moreover, the function of $p$ in front of the right hand side cannot be improved.

Note that the constant in front of the right hand side of the inequality provided by
Lemma 4.1 is nothing else but $pq/(4c_p)$ where $c_p$ is as in Theorem 3.1 and Lemma 3.3.
At this stage, it is important to understand the deep difference between the Poincaré and
the logarithmic Sobolev inequalities at the level of the two-point space. On the two-point
space, the Poincaré inequality turns out to be a simple equality, and Lemma 4.1 is in fact
an entropy-variance comparison. Namely, for every $p \in (0, 1)$ and $f : \{0, 1\} \to \mathbb{R}$,

$$\mathrm{Ent}_{p\delta_1 + q\delta_0}(f^2) \le \frac{\log(q) - \log(p)}{q - p}\,\mathrm{Var}_{p\delta_1 + q\delta_0}(f).$$

This inequality is optimal and $(\log(q) - \log(p))/(q - p)$ tends to $+\infty$ as $\min(p, q)$ goes to
0. Also, for strongly asymmetric Bernoulli measures, the entropy of the square can take
extremely big values for a fixed prescribed variance. This elementary phenomenon helps to
better understand the surprising difference in the behavior of the Poincaré and logarithmic
Sobolev constants of certain two-component mixtures exhibited in the sequel. Moreover,
this observation suggests using asymmetric test functions inspired by the two-point
space in order to show that the logarithmic Sobolev constant may blow up when the
mixing law is strongly asymmetric. We shall however adopt another (quantitative) route.
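The two-point entropy-variance comparison can be explored numerically. The following is a minimal sketch, not taken from the paper: the function names and the test grid are illustrative assumptions, and the check is only a finite-grid sanity test of the inequality $\mathrm{Ent}(f^2) \le \frac{\log q - \log p}{q-p}\mathrm{Var}(f)$ under $\nu = p\delta_1 + q\delta_0$.

```python
import math

def two_point_constant(p):
    """(log q - log p)/(q - p), with the convention 2 at p = q = 1/2."""
    q = 1.0 - p
    if abs(q - p) < 1e-12:
        return 2.0
    return (math.log(q) - math.log(p)) / (q - p)

def entropy_of_square(p, f0, f1):
    """Ent(f^2) under nu = p*delta_1 + q*delta_0, with 0*log(0) = 0."""
    q = 1.0 - p
    def xlogx(u):
        return u * math.log(u) if u > 0.0 else 0.0
    mean_sq = p * f1 ** 2 + q * f0 ** 2
    return p * xlogx(f1 ** 2) + q * xlogx(f0 ** 2) - xlogx(mean_sq)

def variance(p, f0, f1):
    """Var(f) under nu, which equals p*q*(f(0) - f(1))^2."""
    return p * (1.0 - p) * (f0 - f1) ** 2

def worst_ratio(p, grid):
    """Largest value of Ent(f^2) / (constant * Var(f)) over the grid."""
    worst = 0.0
    c = two_point_constant(p)
    for f0 in grid:
        for f1 in grid:
            v = variance(p, f0, f1)
            if v > 0.0:
                worst = max(worst, entropy_of_square(p, f0, f1) / (c * v))
    return worst

# Illustrative grid of values for (f(0), f(1)); an arbitrary choice.
grid = [k / 4.0 for k in range(-8, 9)]
for p in (0.5, 0.1, 0.01, 0.001):
    print(p, two_point_constant(p), worst_ratio(p, grid))
```

The printed constant grows as $p$ decreases, while the ratio stays below one on the grid, in line with the comparison stated above.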
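4.1 Decomposition of the variance and entropy of the mixture

Let $\mu_0$ and $\mu_1$ be two laws on $\mathbb{R}^d$, $p \in [0, 1]$, $q = 1 - p$, $\nu = p\delta_1 + q\delta_0$, and $\mu_p = p\mu_1 + q\mu_0$.
Then, one can decompose and bound the variance of $f : \mathbb{R}^d \to \mathbb{R}$ with respect to $\mu_p$ as

$$\begin{aligned}
\mathrm{Var}_{\mu_p}(f) &= \mathbf{E}_\nu(\theta \mapsto \mathrm{Var}_{\mu_\theta}(f)) + \mathrm{Var}_\nu(\theta \mapsto \mathbf{E}_{\mu_\theta} f) \\
&= \mathbf{E}_\nu(\theta \mapsto \mathrm{Var}_{\mu_\theta}(f)) + pq\,(\mathbf{E}_{\mu_0} f - \mathbf{E}_{\mu_1} f)^2 \\
&\le \max(C_{\mathrm{PI}}(\mu_0), C_{\mathrm{PI}}(\mu_1))\,\mathbf{E}_{\mu_p}(|\nabla f|^2) + pq\,(\mathbf{E}_{\mu_0} f - \mathbf{E}_{\mu_1} f)^2 .
\end{aligned}$$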

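For the entropy, let us write

$$\mathrm{Ent}_{\mu_p}(f^2) = \mathbf{E}_\nu(\theta \mapsto \mathrm{Ent}_{\mu_\theta}(f^2)) + \mathrm{Ent}_\nu(\theta \mapsto \mathbf{E}_{\mu_\theta}(f^2)) .$$

Applying Lemma 4.1 to the function $\theta \mapsto \sqrt{\mathbf{E}_{\mu_\theta}(f^2)}$, one gets

$$\mathrm{Ent}_\nu(\theta \mapsto \mathbf{E}_{\mu_\theta}(f^2)) \le pq\,\frac{\log q - \log p}{q - p}\left(\sqrt{\mathbf{E}_{\mu_0}(f^2)} - \sqrt{\mathbf{E}_{\mu_1}(f^2)}\right)^2 .$$

Since $\mathbf{E}_{\mu_0}(f)\,\mathbf{E}_{\mu_1}(f) \le \sqrt{\mathbf{E}_{\mu_0}(f^2)\,\mathbf{E}_{\mu_1}(f^2)}$, we have

$$\begin{aligned}
\left(\sqrt{\mathbf{E}_{\mu_0}(f^2)} - \sqrt{\mathbf{E}_{\mu_1}(f^2)}\right)^2 &= \mathbf{E}_{\mu_0}(f^2) + \mathbf{E}_{\mu_1}(f^2) - 2\sqrt{\mathbf{E}_{\mu_0}(f^2)\,\mathbf{E}_{\mu_1}(f^2)} \\
&\le \mathrm{Var}_{\mu_0}(f) + \mathrm{Var}_{\mu_1}(f) + (\mathbf{E}_{\mu_0} f - \mathbf{E}_{\mu_1} f)^2 .
\end{aligned}$$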

(Note that the right hand side is not equal to zero if $\mu_0 = \mu_1$.) Using the Poincaré
inequalities for $\mu_0$ and $\mu_1$ provides the following control of the entropy:

$$\begin{aligned}
\mathrm{Ent}_{\mu_p}(f^2) \le{}& \max(C_{\mathrm{GI}}(\mu_0), C_{\mathrm{GI}}(\mu_1))\,\mathbf{E}_{\mu_p}(|\nabla f|^2) \\
&+ pq\,\frac{\log q - \log p}{q - p}\,(\mathbf{E}_{\mu_0} f - \mathbf{E}_{\mu_1} f)^2 \\
&+ \max(C_{\mathrm{PI}}(\mu_0), C_{\mathrm{PI}}(\mu_1))\,\frac{\log q - \log p}{q - p}\,\mathbf{E}_{\mu_p}(|\nabla f|^2) .
\end{aligned}$$

(The worst term is the last one since it always explodes as $\min(p, q)$ goes to zero.) We thus
see that in both cases (Poincaré and logarithmic Sobolev inequalities), the problem can be
reduced to the control of the mean-difference term $(\mathbf{E}_{\mu_0} f - \mathbf{E}_{\mu_1} f)^2$ in terms of $\mathbf{E}_{\mu_p}(|\nabla f|^2)$
for every smooth function $f$. Note that this task is impossible if $\mu_0$ and $\mu_1$ have disjoint
supports.

Remark 4.2 (Finite mixtures). Let $(\mu_i)_{1 \le i \le n}$ be a family of probability measures on $\mathbb{R}^d$.
Consider the finite mixture $\mu = \mathcal{M}(\nu, (\mu_i)_{1 \le i \le n})$ with mixing measure $\nu = p_1\delta_1 + \cdots + p_n\delta_n$.
The decomposition of variance is a general fact valid in particular for $\mu$, and writes

$$\mathrm{Var}_\mu(f) = \mathbf{E}_\nu(\theta \mapsto \mathrm{Var}_{\mu_\theta}(f)) + \mathrm{Var}_\nu(\theta \mapsto \mathbf{E}_{\mu_\theta} f) .$$

Here again, the first term in the right hand side may be controlled with the Poincaré
inequality for each of the components $(\mu_i)_{1 \le i \le n}$. For the second term of the right hand
side, it remains to notice that for every $g : \Theta = \{1, \ldots, n\} \to \mathbb{R}$,

$$\mathrm{Var}_\nu(g) = \frac{1}{2}\sum_{i,j} p_i p_j (g(i) - g(j))^2 = \sum_{i<j} p_i p_j (g(i) - g(j))^2$$

which gives for $g = \mathbf{E}_{\mu_\theta}(f)$ the identity

$$\mathrm{Var}_\nu(\mathbf{E}_{\mu_\theta} f) = \sum_{i<j} p_i p_j (\mathbf{E}_{\mu_i} f - \mathbf{E}_{\mu_j} f)^2 .$$

As for the two-component case, this further reduces the Poincaré inequality for $\mu$ to the
control of the mean-differences $(\mathbf{E}_{\mu_i} f - \mathbf{E}_{\mu_j} f)^2$ in terms of $\mathbf{E}_\mu(|\nabla f|^2)$. An analogous
approach for the entropy and the logarithmic Sobolev inequality can be obtained by using
[14, Theorem A1 p. 49] for instance.
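The pairwise form of the variance on the finite space $\{1, \ldots, n\}$ can be checked on a small case. The sketch below is only illustrative: the weights and the values of $g$ are arbitrary assumptions, and the function names are hypothetical.

```python
def variance_direct(weights, values):
    """Var_nu(g) computed from its definition, with nu = sum_i p_i delta_i."""
    mean = sum(p * g for p, g in zip(weights, values))
    return sum(p * (g - mean) ** 2 for p, g in zip(weights, values))

def variance_pairwise(weights, values):
    """Var_nu(g) as the sum over i < j of p_i p_j (g(i) - g(j))^2."""
    n = len(weights)
    return sum(
        weights[i] * weights[j] * (values[i] - values[j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )

# Arbitrary illustrative mixing weights and component means.
weights = [0.2, 0.3, 0.5]
values = [1.0, -2.0, 4.0]
print(variance_direct(weights, values), variance_pairwise(weights, values))
```

Both expressions agree up to rounding, which is the identity used to reduce the second term of the variance decomposition to pairwise mean-differences.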

4.2 Control of the mean-difference in dimension one

The following lemma provides the control of the mean-difference term $(\mathbf{E}_{\mu_0} f - \mathbf{E}_{\mu_1} f)^2$ in
the case where $\mu_0$ and $\mu_1$ are probability measures on $\mathbb{R}$ (i.e. $d = 1$).

Lemma 4.3 (Control of the mean-difference term in dimension one). Let $\mu_0$ and $\mu_1$
be two probability distributions on $\mathbb{R}$ absolutely continuous with respect to the Lebesgue
measure. Let us denote by $F_0$ (respectively $F_1$) the cumulative distribution function and $f_0$
(respectively $f_1$) the probability density function of $\mu_0$ (respectively $\mu_1$). If $\mathrm{co}(S)$ denotes
the convex envelope of the set $S = \mathrm{supp}(\mu_0) \cup \mathrm{supp}(\mu_1)$, then, for any $p \in (0,1)$, with
$\mu_p = p\mu_1 + q\mu_0$ and $q = 1 - p$, we have

$$(\mathbf{E}_{\mu_0} f - \mathbf{E}_{\mu_1} f)^2 \le I(p)\,\mathbf{E}_{\mu_p}(f'^2) \quad\text{where}\quad I(p) = \int_{\mathrm{co}(S)} \frac{(F_1(x) - F_0(x))^2}{p f_1(x) + q f_0(x)}\,dx,$$

and the constant $I(p)$ cannot be improved. Moreover, the function $p \mapsto I(p)$ is convex, and

$$\frac{1}{2\max(p,q)}\,I\!\left(\tfrac{1}{2}\right) \le I(p) \le \frac{1}{2\min(p,q)}\,I\!\left(\tfrac{1}{2}\right). \qquad (17)$$

Furthermore, if $S$ is not connected then $I$ is constant and equal to $\infty$, while the convexity
of $I$ ensures that $\sup_{p \in (0,1)} I(p) = \max(I(0^+), I(1^-))$ where

$$I(0^+) = \lim_{p \to 0^+} I(p) \quad\text{and}\quad I(1^-) = \lim_{p \to 1^-} I(p),$$
and $I$ is bounded on $(0, 1)$ if and only if $\max(I(0^+), I(1^-)) < \infty$.

Proof of Lemma 4.3. For any smooth and compactly supported function $f$, an integration
by parts gives for every $\theta \in \{0, 1\}$,

$$\mathbf{E}_{\mu_\theta} f = \int_{\mathbb{R}} f(x) f_\theta(x)\,dx = -\int_{\mathbb{R}} f'(x) F_\theta(x)\,dx.$$

Since $F_1 - F_0 = 0$ outside $\mathrm{co}(S)$ we have

$$\mathbf{E}_{\mu_0} f - \mathbf{E}_{\mu_1} f = \int_{\mathrm{co}(S)} (F_1(x) - F_0(x))\,f'(x)\,dx.$$

It remains to use the Cauchy-Schwarz inequality, which gives

$$\begin{aligned}
(\mathbf{E}_{\mu_0} f - \mathbf{E}_{\mu_1} f)^2 &= \left(\int_{\mathrm{co}(S)} \frac{F_1(x) - F_0(x)}{\sqrt{p f_1(x) + q f_0(x)}}\,f'(x)\sqrt{p f_1(x) + q f_0(x)}\,dx\right)^2 \\
&\le I(p) \int_{\mathrm{co}(S)} f'(x)^2\,(p f_1(x) + q f_0(x))\,dx = I(p)\,\mathbf{E}_{\mu_p}(f'^2).
\end{aligned}$$

The equality case of the Cauchy-Schwarz inequality provides the optimality of $I(p)$. The
bound (17) follows from $2\min(p,q)(f_0 + f_1)/2 \le p f_1 + q f_0 \le 2\max(p,q)(f_0 + f_1)/2$. The
other claims of the lemma are immediate. □
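To get a feel for the quantity $I(p)$ and for the two-sided bound (17), one can approximate the integral numerically. The sketch below is an illustration under stated assumptions: the components are taken to be $\mu_0 = \mathcal{N}(-a, 1)$ and $\mu_1 = \mathcal{N}(+a, 1)$, the integral is truncated to a bounded window, and a plain midpoint rule is used; none of these choices come from the text.

```python
import math

def Phi(x):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def phi(x):
    """Standard Gaussian probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def I(p, a=1.0, lo=-8.0, hi=8.0, n=2000):
    """Midpoint-rule approximation of I(p) for mu_0 = N(-a,1), mu_1 = N(+a,1).

    The window [lo, hi] and the number of cells n are illustrative choices.
    """
    q = 1.0 - p
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        num = (Phi(x - a) - Phi(x + a)) ** 2      # (F_1 - F_0)^2
        den = p * phi(x - a) + q * phi(x + a)     # p f_1 + q f_0
        total += num / den * h
    return total

i_half = I(0.5)
for p in (0.1, 0.3, 0.8):
    q = 1.0 - p
    lower = i_half / (2.0 * max(p, q))
    upper = i_half / (2.0 * min(p, q))
    print(p, lower, I(p), upper)
```

On this example the approximated values of $I(p)$ sit between the two sides of (17), and midpoint convexity in $p$ can be tested in the same way.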



4.3 Control of the Poincaré and logarithmic Sobolev constants

By combining the decomposition of the variance and of the entropy given at the beginning
of the current section with Lemma 4.3 and Lemma 4.1 we obtain the following Theorem.

Theorem 4.4 (Poincaré and logarithmic Sobolev inequalities for two-component mixtures).
Let $\mu_0$ and $\mu_1$ be two probability distributions on $\mathbb{R}$ absolutely continuous with
respect to the Lebesgue measure, and consider the two-component mixture $\mu_p = p\mu_1 + q\mu_0$
with $0 \le p \le 1$ and $q = 1 - p$. If $I(p)$ is as in Lemma 4.3 then for every $p \in (0, 1)$,

$$C_{\mathrm{PI}}(\mu_p) \le \max(C_{\mathrm{PI}}(\mu_0), C_{\mathrm{PI}}(\mu_1)) + pq\,I(p)$$

and

$$C_{\mathrm{GI}}(\mu_p) \le \max(C_{\mathrm{GI}}(\mu_0), C_{\mathrm{GI}}(\mu_1)) + \frac{\log q - \log p}{q - p}\,\bigl(pq\,I(p) + \max(C_{\mathrm{PI}}(\mu_0), C_{\mathrm{PI}}(\mu_1))\bigr).$$

In particular, since $\sup_{p \in (0,1)} I(p) = \max(I(0^+), I(1^-))$ where $I(0^+)$ and $I(1^-)$ are as in
Lemma 4.3, we get the following uniform bound:

$$\sup_{p \in (0,1)} C_{\mathrm{PI}}(\mu_p) \le \max(C_{\mathrm{PI}}(\mu_0), C_{\mathrm{PI}}(\mu_1)) + \frac{1}{4}\max(I(0^+), I(1^-)).$$

Moreover, if $I(0^+) < \infty$ (respectively if $I(1^-) < \infty$) then

$$\limsup_{p \to 0^+ \text{ (respectively } 1^-)} C_{\mathrm{PI}}(\mu_p) \le \max(C_{\mathrm{PI}}(\mu_0), C_{\mathrm{PI}}(\mu_1)).$$

The upper bounds given by Theorem 4.4 must be understood in $[0, \infty]$ since the right
hand side can be infinite (in such a case the bound is of course useless). Additionally, by
Lemma 4.3 the function $p \mapsto I(p)$ is convex, and it is possible that $I(1/2) < \infty$ while
$\max(I(0^+), I(1^-)) = \infty$. The following corollary provides a uniform bound on the Poincaré
constant of a two-component mixture in terms of $I(1/2)$ without using $\max(I(0^+), I(1^-))$.
This corollary has no immediate logarithmic Sobolev counterpart, as explained in the
remark following the proof of the corollary.

Corollary 4.5 (Uniform Poincaré inequality for two-component mixtures). Let $\mu_0$ and
$\mu_1$ be two probability distributions on $\mathbb{R}$ absolutely continuous with respect to the Lebesgue
measure and consider the mixture $\mu_p = p\mu_1 + q\mu_0$ for every $p \in [0, 1]$. We have then

$$\max_{p \in [0,1]} C_{\mathrm{PI}}(\mu_p) \le \max(C_{\mathrm{PI}}(\mu_0), C_{\mathrm{PI}}(\mu_1)) + \frac{1}{2}\,I\!\left(\tfrac{1}{2}\right)$$

where $I(1/2)$ is as in Lemma 4.3.

Proof of Corollary 4.5. We observe that thanks to (17), one has

$$pq\,I(p) = \max(p, q)\min(p, q)\,I(p) \le \frac{1}{2}\,I\!\left(\tfrac{1}{2}\right)$$

and Theorem 4.4 provides the desired result. □



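Remark 4.6 (Blow-up of the logarithmic Sobolev constant). With the notations of Corollary 4.5,
Theorem 4.4 combined with the bound $pq\,I(p) \le \frac{1}{2} I(\frac{1}{2})$ gives, for every $p \in (0, 1)$,

$$C_{\mathrm{GI}}(\mu_p) \le \max(C_{\mathrm{GI}}(\mu_0), C_{\mathrm{GI}}(\mu_1)) + \frac{\log q - \log p}{q - p}\left(\frac{1}{2}\,I\!\left(\tfrac{1}{2}\right) + \max(C_{\mathrm{PI}}(\mu_0), C_{\mathrm{PI}}(\mu_1))\right).$$

Since $(\log(q) - \log(p))/(q - p)$ goes to $+\infty$ at speed $-\log(\min(p, q))$ as $\min(p, q)$ goes to
0, we cannot derive a uniform logarithmic Sobolev inequality for two-component mixtures
under the sole assumption that $I(1/2) < \infty$. Surprisingly, we shall see in the sequel that
this behavior is sharp and cannot be improved in general for two-component mixtures.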
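The contrast between the two upper bounds can be made concrete with a few evaluations. The sketch below only evaluates right-hand sides obtained from Theorem 4.4 together with $pq\,I(p) \le \frac{1}{2} I(\frac{1}{2})$; the numerical inputs (`cpi_max`, `cgi_max`, `i_half`) are hypothetical placeholder values chosen for illustration, not constants from any example in the text.

```python
import math

def two_point_constant(p):
    """(log q - log p)/(q - p), with the convention 2 at p = q = 1/2."""
    q = 1.0 - p
    if abs(q - p) < 1e-12:
        return 2.0
    return (math.log(q) - math.log(p)) / (q - p)

def poincare_uniform_bound(cpi_max, i_half):
    """Uniform-in-p Poincare bound: max(C_PI) + I(1/2)/2."""
    return cpi_max + 0.5 * i_half

def logsob_bound(p, cpi_max, cgi_max, i_half):
    """Log-Sobolev bound from Theorem 4.4 with pq I(p) replaced by I(1/2)/2."""
    return cgi_max + two_point_constant(p) * (0.5 * i_half + cpi_max)

# Hypothetical inputs, for illustration only.
cpi_max, cgi_max, i_half = 1.0, 2.0, 5.0
print("Poincare bound (all p):", poincare_uniform_bound(cpi_max, i_half))
for p in (0.5, 1e-1, 1e-3, 1e-6):
    print(p, logsob_bound(p, cpi_max, cgi_max, i_half), -math.log(p))
```

The Poincaré bound does not depend on $p$, while the logarithmic Sobolev bound grows like a multiple of $-\log(p)$ as $p \to 0$.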

4.4 The fundamental example of two Gaussians with identical variance 

It was already observed by Johnson in [29, Theorem 1.1 page 536] that for the finite
univariate Gaussian mixture $\mu = p_1\mathcal{N}(m_1, \tau^2) + \cdots + p_n\mathcal{N}(m_n, \tau^2)$, we have



where $\sigma^2 = (p_1 m_1^2 + \cdots + p_n m_n^2) - (p_1 m_1 + \cdots + p_n m_n)^2$ is the variance of $p_1\delta_{m_1} + \cdots + p_n\delta_{m_n}$.

This upper bound on the Poincaré constant blows up as $\min_{1 \le i \le n} p_i$ goes to 0. Madras
and Randall have also obtained [35, Theorem 1.2 and Section 5] upper bounds for the
Poincaré constant of non-Gaussian finite mixtures under an overlapping condition on the
supports of the components. As for the result of Johnson mentioned earlier, their upper
bound blows up when the minimum weight of the mixing law $\min_{1 \le i \le n} p_i$ goes to 0. In the
sequel, we show that the Poincaré constant can actually remain bounded as $\min_{1 \le i \le n} p_i$
goes to 0. To fix ideas, we will consider the special case of a two-component mixture of two
Gaussian distributions $\mathcal{N}(-a, 1)$ and $\mathcal{N}(+a, 1)$. As usual, we denote by $\Phi$ (respectively
$\varphi$) the cumulative distribution function (respectively probability density function) of the
standard Gaussian measure $\mathcal{N}(0, 1)$.

Corollary 4.7 (Mixture of two Gaussians with identical variance). For any $a > 0$ and
$0 < p < 1$, let $\mu_0 = \mathcal{N}(-a, 1)$ and $\mu_1 = \mathcal{N}(+a, 1)$, and define the two-component mixture
$\mu_p = p\mu_1 + q\mu_0$. Then




+ max(Cpi(/i ), C P i(/ii)) 





Additionally, a sharper upper bound for $p = 1/2$ is given by

$$C_{\mathrm{PI}}(\mu_{1/2}) \le 1 + a\,\frac{2\Phi(a) - 1}{2\varphi(a)}.$$
Note that as a function of p, the obtained upper bounds on the constants are continuous
on the whole interval [0,1]. The bound (JS|) expressed in the univariate situation implies 
that Cpi is always greater than or equal to the variance of the probability measure. Here, 
the variance of $\mu_p$ is equal to $1 + 4a^2pq$. Then the upper bound on the Poincaré constant
given above is sharp for any $p \in (0, 1)$ as $a$ goes to 0.

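Proof of Corollary 4.7. Lemma 4.3 ensures that $p \mapsto I(p)$ is a convex function: let us have
a look at $I(0^+)$ and $I(1^-)$ which are here equal by symmetry. Since

$$\Phi(x + a) - \Phi(x - a) = \int_{-a}^{a} \varphi(x + u)\,du \le 2a \begin{cases} \varphi(x + a) & \text{if } x < -a, \\ \varphi(0) & \text{if } -a \le x \le a, \\ \varphi(x - a) & \text{if } a < x, \end{cases}$$

one has

$$\begin{aligned}
I(1^-) &= \int_{\mathbb{R}} \frac{(\Phi(x + a) - \Phi(x - a))^2}{\varphi(x - a)}\,dx \\
&\le 4a^2\left(\int_{-\infty}^{-a} \frac{\varphi(x + a)^2}{\varphi(x - a)}\,dx + \varphi(0)^2 \int_{-a}^{a} \frac{dx}{\varphi(x - a)} + \int_{a}^{+\infty} \varphi(x - a)\,dx\right) \\
&= 4a^2\left(e^{4a^2}\int_{-\infty}^{-a} \frac{e^{-\frac{(x + 3a)^2}{2}}}{\sqrt{2\pi}}\,dx + \frac{1}{\sqrt{2\pi}}\int_{0}^{2a} e^{\frac{x^2}{2}}\,dx + \int_{0}^{+\infty} \varphi(x)\,dx\right) \\
&\le 4a^2\left(\Phi(2a)\,e^{4a^2} + \frac{2a}{\sqrt{2\pi}}\,e^{2a^2} + \frac{1}{2}\right).
\end{aligned}$$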

Then, the first statement follows from Theorem 4.4. For the second one, by Lemma 4.8
given at the end of the section, we have

$$\begin{aligned}
I\!\left(\tfrac{1}{2}\right) &= 2\int_{\mathbb{R}} \frac{\Phi(x + a) - \Phi(x - a)}{\varphi(x + a) + \varphi(x - a)}\,(\Phi(x + a) - \Phi(x - a))\,dx \\
&\le 2\tau_a \int_{\mathbb{R}} (\Phi(x + a) - \Phi(x - a))\,dx \\
&= 4a\tau_a .
\end{aligned}$$

This gives as expected $I(1/2) \le 2a(2\Phi(a) - 1)/\varphi(a)$. □

The following lemma shows that $I(1/2)$ is related to some kind of "band isoperimetry".
Note that Lemma 4.3 provides a more general approach beyond the Gaussian case.

Lemma 4.8 (Band bound). For any $x \in \mathbb{R}$ and any $a > 0$,

$$\frac{\Phi(x + a) - \Phi(x - a)}{\varphi(x + a) + \varphi(x - a)} \le \tau_a := \frac{\Phi(+a) - \Phi(-a)}{\varphi(+a) + \varphi(-a)}.$$

Moreover, this constant cannot be improved. As an example, one has $\tau_1 \approx 1.410686134$.

Proof of Lemma 4.8. Assume that $a = 1$. Let $\tau > 0$ and define for any $x \in \mathbb{R}$

$$\alpha(x) = \Phi(x + 1) - \Phi(x - 1) - \tau(\varphi(x + 1) + \varphi(x - 1)).$$

One has $\alpha'(x) = 0$ iff $\tau(1 + x + (x - 1)e^{2x}) = e^{2x} - 1$. Thus, either $x = 0$, or

$$\tau^{-1} = \beta(x) = -1 + x\coth(x).$$

The function $\beta$ is even, convex, and achieves its global minimum at $x = 0$. Therefore, the
equation $\alpha'(x) = 0$ has three solutions $\{-x_\tau, 0, +x_\tau\}$, where $x_\tau > 0$ satisfies $\tau\beta(x_\tau) = 1$.
Since $\lim_{x \to \pm\infty} \alpha(x) = 0$, one has $\alpha \le 0$ on $\mathbb{R}$ if and only if $\alpha(0) \le 0$ and $\alpha''(0) \le 0$. The
condition $\alpha(0) \le 0$ is fulfilled as soon as

$$\tau \ge \frac{\Phi(+1) - \Phi(-1)}{\varphi(+1) + \varphi(-1)}$$

whereas the condition $\alpha''(0) \le 0$ holds for any $\tau$. The case where $a \ne 1$ is similar. □
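The band bound can be sanity-checked numerically. The sketch below is illustrative only: the grid of points $x$ and the values of $a$ are arbitrary assumptions, and a finite grid check is of course not a proof.

```python
import math

SQRT2 = math.sqrt(2.0)

def phi(x):
    """Standard Gaussian probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def band_ratio(x, a):
    """(Phi(x+a) - Phi(x-a)) / (phi(x+a) + phi(x-a)) for the band [x-a, x+a]."""
    mass = 0.5 * (math.erf((x + a) / SQRT2) - math.erf((x - a) / SQRT2))
    return mass / (phi(x + a) + phi(x - a))

def tau(a):
    """The constant tau_a, i.e. the ratio for the centered band [-a, a]."""
    return band_ratio(0.0, a)

def max_ratio_on_grid(a, lo=-6.0, hi=6.0, step=0.01):
    """Largest band ratio over an illustrative grid of centers x."""
    n = int(round((hi - lo) / step))
    return max(band_ratio(lo + k * step, a) for k in range(n + 1))

print(tau(1.0))
for a in (0.5, 1.0, 2.0):
    print(a, max_ratio_on_grid(a), tau(a))
```

On the grid, the largest ratio is attained at the centered band, and `tau(1.0)` agrees with the value of $\tau_1$ quoted in the lemma.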

Remark 4.9 (Relation with isoperimetry). If $A_x = [x - a, x + a]$ then $\partial A_x = \{x - a, x + a\}$.
If $\gamma = \mathcal{N}(0, 1)$ then $\gamma(A_x) = \Phi(x + a) - \Phi(x - a)$ while $\gamma_s(\partial A_x) = \varphi(x + a) + \varphi(x - a)$ where
$\gamma_s$ is the surface measure associated to $\gamma$, see e.g. [32]. Lemma 4.8 expresses that for any
$A \in \mathcal{C}_a = \{A_x ; x \in \mathbb{R}\}$, we have $\gamma(A) \le \tau_a \gamma_s(\partial A)$ and equality is achieved for $A = A_0$.
Recall that the Gaussian isoperimetric inequality states that $(\varphi \circ \Phi^{-1})(\gamma(A)) \le \gamma_s(\partial A)$ for
any regular $A \subset \mathbb{R}$ with equality when $A$ is a half line, see e.g. [32] and references therein.

4.5 Gallery of examples of one-dimensional two-component mixtures 

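Recall that if $\mu$ is a probability measure on $\mathbb{R}$ with density $f > 0$ and median $m$ then

$$\max(b_-, b_+) \le C_{\mathrm{GI}}(\mu) \le 16\max(b_-, b_+) \qquad (18)$$

where

$$b_+ = \sup_{x > m} \mu([x, +\infty))\,\log\!\left(1 + \frac{1}{2\mu([x, +\infty))}\right) \int_m^x \frac{1}{f(y)}\,dy$$

and

$$b_- = \sup_{x < m} \mu((-\infty, x])\,\log\!\left(1 + \frac{1}{2\mu((-\infty, x])}\right) \int_x^m \frac{1}{f(y)}\,dy.$$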

These bounds appear in Remark 7 page 9] as a refinement of a famous criterion by 
Bobkov and Gotze based on previous works of Hardy and Muckenhoupt, see also More 
generally, the notion of measure capacities constitutes a powerful tool for the control of 
Cpi and Cqi, see [38] and [HE]. In the present article, we only use a weak version of such 
criteria, stated in the following lemma, and which can be found for instance in [TJ Chapter 
6 page 107]. We will typically use it in order to show that $C_{\mathrm{GI}}(p\mu_1 + q\mu_0)$ blows up as $p$
goes to $0$ or $1$ for certain choices of $\mu_0$ and $\mu_1$.

Lemma 4.10 (Crude lower bound). Let $\mu$ be some distribution on $\mathbb{R}$ with density $f > 0$,
then for every median $m$ of $\mu$ and every $x \le m$, by denoting $\Psi(u) = -u\log(u)$,

$$150\,C_{\mathrm{GI}}(\mu) \ge \Psi(\mu((-\infty, x])) \int_x^m \frac{1}{f(y)}\,dy.$$
In this whole section, $\mu_0$ and $\mu_1$ are absolutely continuous probability measures on
$\mathbb{R}$ with cumulative distribution functions $F_0$ and $F_1$ and probability density functions $f_0$
and $f_1$. For every $0 \le p \le 1$, we consider the two-component mixture $\mu_p = p\mu_1 + q\mu_0$.
The sharp analysis of the logarithmic Sobolev constant for finite mixtures is a difficult
problem. We therefore decided to focus on some enlightening examples, by providing a gallery
of special cases of $\mu_0$ and $\mu_1$ for which we are able to control the dependence over $p$ of the
Poincaré and logarithmic Sobolev constants of $\mu_p$. Some of them are quite surprising and
reveal hidden subtleties of the logarithmic Sobolev inequality as $\min(p, q)$ goes to 0. The
key tools used here are Theorem 4.4 and Lemma 4.10.

4.5.1 One Gaussian and a sub-Gaussian 

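Setting. Here $\mu_1 = \mathcal{N}(0, 1)$ while $\mu_0$ is such that $f_0 \le \kappa f_1$ for some finite constant $\kappa \ge 1$.
Claim. For every $0 < p < 1$ we have $C_{\mathrm{PI}}(\mu_p) \le \max(1, C_{\mathrm{PI}}(\mu_0)) + Dq$ for some constant
$D > 0$ which does not depend on $p$. This upper bound
goes to $\max(1, C_{\mathrm{PI}}(\mu_0))$ as $p \to 1$ and is additionally uniformly bounded when $p$ runs over
$(0, 1)$. Similarly, $C_{\mathrm{GI}}(\mu_p) \le \alpha - \beta\log(p)$ for some constants $\alpha, \beta > 0$ which do not depend
on $p$. This upper bound blows up at speed $-\log(p)$ as $p \to 0$. This is actually the real
behavior of $C_{\mathrm{GI}}(\mu_p)$ in some situations as shown in Section 4.5.4.

Proof. Since $\mu_1 = \mathcal{N}(0, 1)$, we have $C_{\mathrm{PI}}(\mu_1) = 1$ and $C_{\mathrm{GI}}(\mu_1) = 2$. By hypothesis, we
have $F_0 \le \kappa F_1$ and $1 - F_0 \le \kappa(1 - F_1)$. Thus, for some $D > 0$ and every $0 < p < 1$,

$$I(p) \le \frac{1}{p}\int_{\mathbb{R}} \frac{(F_1(x) - F_0(x))^2}{f_1(x)}\,dx \le \frac{D}{p}.$$

Now Theorem 4.4 shows that $C_{\mathrm{PI}}(\mu_p) \le \max(1, C_{\mathrm{PI}}(\mu_0)) + Dq$. The desired upper bound
for $C_{\mathrm{GI}}(\mu_p)$ follows in the same way and we leave the details to the reader.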

4.5.2 Two Gaussians with identical mean 

We have already considered the mixture of two Gaussians with identical variances and
different means in Section 4.4. Here we consider a mixture of two Gaussians with identical
means and different variances. It turns out that this Gaussian mixture is a simple Gaussian
sub-case of Section 4.5.1 for which we are able to provide a more precise bound for $C_{\mathrm{PI}}$.
Setting. $\mu_1 = \mathcal{N}(0, \sigma^2)$ with $\sigma > 1$ and $\mu_0 = \mathcal{N}(0, 1)$.
Claim. There exists $C > 0$ such that, for any $p < 1/2$,

J(p) < c( -J " ^ and Cpi(fi p ) ^a 2 + Cp^ . 
Moreover we have $\sup_{p \in (0,1)} C_{\mathrm{PI}}(\mu_p) < \infty$.

Proof. We have $f_0 \le \kappa f_1$ for some $\kappa > 1$, and we recover the setting of Section 4.5.1. Let
us provide now an upper bound for $I(p)$ when $p$ is close to 0. We have $p f_1(x) \ge q f_0(x)$ if
and only if $|x| \ge x_p$ where

$$x_p^2 = \frac{2\sigma^2}{\sigma^2 - 1}\,\log\!\left(\frac{q\sigma}{p}\right).$$

We have, for some constant $C > 0$,



I(p) < 2 f ' ^ + 2 T 



x 2 p/i(a;) +g/o(a;) 

since 2g ^ 1 and -Fi(x) ^ /i(x)/|x|. If p is sufficiently small then x p > 1 and 

f- 1 1 h{xf r 1 1 h{xf 1 _ 

/ ~2 f ?~S i — TT^ dx ^ 2 ~2 t ( \ dx + -F 1 (-Xp). 
J-oo x z pfi{x) + g/ (x) J-x p x M x ) V 

By the definition of x p , for some C > 0, 

cr 2 -2 

P P VP/ 

If a 2 ^ 2, then this function of p is bounded. On the other hand, for some C > 0, 



-Xp x ^ fo{ x ) J —Xp X ^ 

If a 2 ^ 2, then this function of p is bounded. If a 2 > 2, then, for some C > 0, 

ct 2 -2 

As a conclusion, if a 2 ^ 2, then sup pg ( 0) i) I(p) < 00 > whereas if <r 2 > 2, then for some 
constant C > and any p < 1/2, 



The bound of $C_{\mathrm{PI}}$ follows from Theorem 4.4. For the logarithmic Sobolev inequality, one
may use the Bobkov-Götze criterion.

4.5.3 Two uniforms with overlapping supports 

Setting. Here $\mu_0 = \mathcal{U}([0, 1])$ and $\mu_1 = \mathcal{U}([a, a + 1])$ for some $a \in [0, 1]$.
Claim. For every $p \in (0, 1)$, we have

$$C_{\mathrm{PI}}(\mu_p) \le \pi^{-2} + \frac{a^2}{3}\,(3pq(1 - a) + a)$$
and for p ^ 1/2, 

a 2 



C Gl (v p )>— log(l/p). 
Proof. It is known (see e.g. [19]) that $C_{\mathrm{PI}}(\mathcal{U}([0, 1])) = \pi^{-2}$ while $C_{\mathrm{GI}}(\mathcal{U}([0, 1])) = 2\pi^{-2}$. By
translation invariance, we also have $C_{\mathrm{PI}}(\mathcal{U}([a, a + 1])) = \pi^{-2}$ and $C_{\mathrm{GI}}(\mathcal{U}([a, a + 1])) = 2\pi^{-2}$.
The desired result follows from Theorem 4.4 since for $p \in (0, 1)$,

$$I(p) = \int_0^a \frac{x^2}{q}\,dx + \int_a^1 \frac{a^2}{p + q}\,dx + \int_1^{a+1} \frac{(1 + a - x)^2}{p}\,dx = \frac{a^2}{3pq}\,(3pq(1 - a) + a).$$

The lower bound on $C_{\mathrm{GI}}(\mu_p)$ follows from Lemma 4.10.
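The closed form of $I(p)$ for the two uniform laws can be cross-checked against a direct numerical integration of $(F_1 - F_0)^2/(p f_1 + q f_0)$. The sketch below is a minimal illustration: the helper names, the midpoint rule and the sample values of $p$ and $a$ are assumptions made for the example.

```python
import math

def midpoint(g, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * h) for k in range(n)) * h

def cdf_uniform(x, lo):
    """Cumulative distribution function of U([lo, lo + 1])."""
    return min(max(x - lo, 0.0), 1.0)

def pdf_uniform(x, lo):
    """Probability density function of U([lo, lo + 1])."""
    return 1.0 if lo <= x <= lo + 1.0 else 0.0

def I_numeric(p, a, n=2000):
    """I(p) for mu_0 = U([0,1]) and mu_1 = U([a,a+1]), integrated piece by piece."""
    q = 1.0 - p
    def integrand(x):
        diff = cdf_uniform(x, a) - cdf_uniform(x, 0.0)
        return diff * diff / (p * pdf_uniform(x, a) + q * pdf_uniform(x, 0.0))
    pieces = [(0.0, a), (a, 1.0), (1.0, 1.0 + a)]
    return sum(midpoint(integrand, lo, hi, n) for lo, hi in pieces if hi > lo)

def I_closed(p, a):
    """Closed form a^2/(3pq) * (3pq(1 - a) + a)."""
    q = 1.0 - p
    return a * a / (3.0 * p * q) * (3.0 * p * q * (1.0 - a) + a)

def poincare_bound(p, a):
    """Right-hand side pi^{-2} + pq I(p) of the Poincare bound."""
    return math.pi ** -2 + p * (1.0 - p) * I_closed(p, a)

for p, a in ((0.3, 0.5), (0.05, 1.0), (0.9, 0.25)):
    print(p, a, I_numeric(p, a), I_closed(p, a), poincare_bound(p, a))
```

Splitting the integral at the breakpoints $a$ and $1$ keeps each piece smooth, so the midpoint rule matches the closed form to high accuracy.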

4.5.4 One Gaussian and a uniform 
Setting. Here $\mu_1 = \mathcal{N}(0, 1)$ and $\mu_0 = \mathcal{U}([-1, +1])$.

Claim. There exists a real constant $C > 0$ such that $C_{\mathrm{GI}}(\mu_p) \ge -C\log(p)$ for every
$p \in (0, 1)$. Also, $C_{\mathrm{GI}}(\mu_p)$ blows up at speed $-\log(p)$ as $p \to 0^+$. Moreover, $\mu_p$ satisfies
a sub-Gaussian concentration of measure for Lipschitz functions, uniformly in $p \in (0, 1)$.
This similarity with the Bernoulli law $\mathcal{B}(p)$ suggests that the blow up phenomenon of
$C_{\mathrm{GI}}(\mu_p)$ is due to the asymptotic support reduction from $\mathbb{R}$ to $[-1, +1]$ when $p$ goes to
$0^+$. Actually, Section 4.5.5 shows that this intuition is false.

Proof. We have $f_0 \le \kappa f_1$ for some constant $\kappa \ge 1$. Also, for every $p \in (0, 1)$, the result
of Section 4.5.1 gives that $C_{\mathrm{GI}}(\mu_p) \le \alpha - \beta\log(p)$ for some constants $\alpha > 0$ and $\beta > 0$
independent of $p$. Now, by Lemma 4.10,

$$\begin{aligned}
150\,C_{\mathrm{GI}}(\mu_p) &\ge \Psi(pF_1(-2) + qF_0(-2)) \int_{-2}^{0} \frac{1}{p f_1(u) + q f_0(u)}\,du \\
&\ge -\left(F_1(-2) \int_{-2}^{-1} \frac{1}{f_1(u)}\,du\right)\log(p).
\end{aligned}$$



4.5.5 Surprising blow up 

Setting. Here $f_1(x) = Z_1^{-1}e^{-x^2}$ and $f_0(x) = Z_0^{-1}e^{-|x|^a}$ for some fixed real number $a > 2$,
with $Z_1 = \sqrt{\pi}$ and $Z_0 = 2\Gamma(a^{-1})a^{-1}$. Note that $\mu_0$ has lighter tails than $\mu_p$ with $p > 0$.

Claim. There exists a real constant $C > 0$ which may depend on $a$ such that

$$C_{\mathrm{GI}}(\mu_p) \ge C(-\log(p))^{1 - 2a^{-1}}$$

for small enough $p$. In particular, $C_{\mathrm{GI}}(\mu_p)$ blows up as $p \to 0^+$.

Comments. As mentioned in the introduction, we have $\max(C_{\mathrm{GI}}(\mu_0), C_{\mathrm{GI}}(\mu_1)) < \infty$.
We have seen in Section 4.5.2 that $C_{\mathrm{GI}}(\mu_p)$ does not blow up as $p \to 0^+$ if $a = 2$. Here
$a > 2$, and $\mu_0$ has strictly lighter tails than $\mu_p$ for every $p \in (0, 1)$, and moreover, this
difference is at the level of the log-power of the tails, not only at the level of the constants
in front of the log-power. The potential ($-\log$-density) of $\mu_p$ has multiple wells, see Figure
1. This example shows also that the blow up speed of $C_{\mathrm{GI}}(\mu_p)$ as $p \to 0^+$ cannot be
improved by considering a mixture of fully supported laws. Note that $\mu_0 \to \mathcal{U}([-1, +1])$
as $a \to \infty$, and the result is thus compatible with Section 4.5.4.
Proof. Since $f_0 \le \kappa f_1$ for some constant $\kappa \ge 1$, Section 4.5.1 gives $C_{\mathrm{GI}}(\mu_p) < \infty$ for every
$p \in (0, 1)$. Moreover, $p \mapsto C_{\mathrm{GI}}(\mu_p)$ is uniformly bounded on $(p_0, 1)$ for every $p_0 > 0$. Let us
study the behavior of this function as $p \to 0$. In the sequel we assume that $p < p_0$ where
$p_0$ satisfies $p_0 Z_0 = q_0 Z_1$ with $q_0 = 1 - p_0$. The immediate tails comparison gives $q f_0(x) \le p f_1(x)$ for large
enough $x$. Let us find some explicit bound on $x$. The inequality $q f_0(x) \le p f_1(x)$ writes
$|x|^a - x^2 \ge \log(qZ_1) - \log(pZ_0)$. Now, $|x|^a - x^2 \ge \frac{1}{2}|x|^a$ for $|x|^{a-2} \ge 2$. The non-negative
solution of $\frac{1}{2}|x|^a = \log(qZ_1) - \log(pZ_0)$ is

$$x_p = \bigl(2\log(qZ_1) - 2\log(pZ_0)\bigr)^{1/a}.$$

If $p$ is small enough, then $|x_p|^{a-2} \ge 2$ and therefore, $q f_0(x) \le p f_1(x)$ for any $|x| \ge x_p$.
Now, by Lemma 4.10, for small enough $p$,

$$150\,C_{\mathrm{GI}}(\mu_p) \ge \Psi(pF_1(-2x_p) + qF_0(-2x_p)) \int_{-2x_p}^{0} \frac{1}{p f_1(u) + q f_0(u)}\,du.$$

For small enough p, we have max(i ? o, 2x p ) < e _1 and thus, for some constant C > 0, 

— 4x 2 

*(pFi(-2x p ) + gF (-2x p )) ^ *(pF 1 (-2x p )) ^ -pifi(-2x p )log(p) > C^r^*0). 

Xp 

On the other hand, since qfo(x) ^ pfi(x) for |x| ^ x p , we get for some constant C > 0, 



/_: 



cfoi Ce p 
du > / ^—r > 



-2x p Ph{u) + g/oO) J~2x p 2 Pfl( u ) PXp 

Consequently, for some real constant C > 0, 

,log(p) 



150C G i(/i P ) ^ -C- 



-2 

P 



Now, by using the explicit expression of $x_p$, we finally obtain for some real constant $C > 0$,

$$C_{\mathrm{GI}}(\mu_p) \ge C(-\log(p))^{1 - 2a^{-1}}.$$



4.6 Multivariate mean-difference bound 

It is quite natural to ask for a multidimensional counterpart of the mean-difference Lemma
4.3. Let us give some informal ideas to attack this problem. Let $\mu_0$ and $\mu_1$ be two
probability measures on $\mathbb{R}^d$ and consider as usual the mixture $\mu_p = p\mu_1 + q\mu_0$ with
$p \in (0, 1)$ and $q = 1 - p$. It is well known (see for instance [51]) that if $\mu_0$ and $\mu_1$ are
regular enough, then there exists a map $T : \mathbb{R}^d \to \mathbb{R}^d$ such that the image measure $T \cdot \mu_0$
of $\mu_0$ by $T$ is $\mu_1$ and

$$W_2(\mu_0, \mu_1)^2 = \int_{\mathbb{R}^d} |T(x) - x|^2\,\mu_0(dx).$$

If $\mu_{(s)}$ denotes the image of $\mu_0$ by $x \mapsto sT(x) + (1 - s)x$ for every $0 < s < 1$, then

$$(\mathbf{E}_{\mu_1} f - \mathbf{E}_{\mu_0} f)^2 = \left(\int_0^1 \int_{\mathbb{R}^d} (T(x) - x) \cdot \nabla f(sT(x) + (1 - s)x)\,\mu_0(dx)\,ds\right)^2 .$$
Figure 1: Density and second derivative of $-\log$-density of $\mu_p$ for Example 4.5.5 with
$p = 1/100$ and $a = 4$. The second plot reveals a deep multiple wells potential.



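By the Cauchy-Schwarz inequality, we get

$$(\mathbf{E}_{\mu_1} f - \mathbf{E}_{\mu_0} f)^2 \le \left(\int_{\mathbb{R}^d} |T(x) - x|^2\,d\mu_0(x)\right)\left(\int_0^1 \int_{\mathbb{R}^d} |\nabla f(x)|^2\,d\mu_{(s)}(x)\,ds\right)$$

and therefore

$$(\mathbf{E}_{\mu_1} f - \mathbf{E}_{\mu_0} f)^2 \le W_2(\mu_0, \mu_1)^2 \int_{\mathbb{R}^d} \int_0^1 |\nabla f(x)|^2\,d\mu_{(s)}(x)\,ds.$$

This shows that in order to control the mean-difference term $(\mathbf{E}_{\mu_1} f - \mathbf{E}_{\mu_0} f)^2$ by $\mathbf{E}_{\mu_p}(|\nabla f|^2)$,
it is enough to find a real constant $C_p > 0$ such that $\bar{\mu} \le C_p \mu_p$ where

$$\bar{\mu}(A) = \int_0^1 \mu_{(s)}(A)\,ds.$$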

Unfortunately, this is not feasible if for some $s \in (0, 1)$, the support of $\mu_{(s)}$ is not included
in the support of $\mu_p$ (union of the supports of $\mu_0$ and $\mu_1$ if $p \in (0, 1)$). This problem is
due to the linear interpolation used to define $\mu_{(s)}$ via $T$. The linear interpolation will fail
if the support of $\mu_p$ is a non-convex connected set. Let us adopt an alternative pathwise
interpolation scheme. For each $x \in S_0 = \mathrm{supp}(\mu_0)$, let us pick a continuous and piecewise
smooth interpolating path $\gamma_x : [0, 1] \to \mathbb{R}^d$ such that $\gamma_x(0) = x$ and $\gamma_x(1) = T(x)$. Then
for every smooth $f : \mathbb{R}^d \to \mathbb{R}$,

$$|f(x) - f(T(x))| = \left|\int_0^1 \dot{\gamma}_x(s) \cdot \nabla f(\gamma_x(s))\,ds\right| \le \left(\int_0^1 |\dot{\gamma}_x(s)|^2\,ds\right)^{1/2}\left(\int_0^1 |\nabla f|^2(\gamma_x(s))\,ds\right)^{1/2}.$$






As a consequence, we have

$$(\mathbf{E}_{\mu_0} f - \mathbf{E}_{\mu_1} f)^2 \le \left(\int_{S_0} \int_0^1 |\dot{\gamma}_x(s)|^2\,ds\,\mu_0(dx)\right)\left(\int_{S_0} \int_0^1 |\nabla f|^2(\gamma_x(s))\,ds\,\mu_0(dx)\right).$$

Now, let $\mu_{(s)}$ be the image measure of $\mu_0$ by the map $x \mapsto \gamma_x(s)$, where here again $\bar{\mu}$ is
the measure defined by $\bar{\mu}(A) = \int_0^1 \mu_{(s)}(A)\,ds$. With this notation, we have

$$(\mathbf{E}_{\mu_0} f - \mathbf{E}_{\mu_1} f)^2 \le \left(\int_{S_0} \int_0^1 |\dot{\gamma}_x(s)|^2\,ds\,\mu_0(dx)\right)\left(\int |\nabla f|^2(x)\,\bar{\mu}(dx)\right).$$

Note that

$$\int_{S_0} \int_0^1 |\dot{\gamma}_x(s)|^2\,ds\,\mu_0(dx) \ge W_2(\mu_0, \mu_1)^2$$

with equality when $\gamma_x$ is the linear interpolation map between $x$ and $T(x)$ for every $x \in S_0$.
The mean-difference control that we seek then follows immediately if there exists a real
constant $C_p > 0$ such that $\bar{\mu} \le C_p \mu_p$. The problem is thus reduced to the choice of an
interpolation scheme $\gamma$ such that the support of $\bar{\mu}$ is included in the support of $\mu_p$ (which
is the union of the supports of $\mu_0$ and $\mu_1$ as soon as $0 < p < 1$). Let us give now two
enlightening examples.

Example 4.11 (When the linear interpolation map is optimal). Consider the two-dimensional
example where $\mu_0 = \mathcal{U}([0, 2] \times [0, 2])$ and $\mu_1 = \mathcal{U}([1, 3] \times [0, 2])$. If $\gamma$ is the natural
linear interpolation map given by $\gamma_x(s) = x + se_1$ then $\mu_{(s)} = \mathcal{U}([s, s + 2] \times [0, 2])$ is
supported inside $\mathrm{supp}(\mu_0) \cup \mathrm{supp}(\mu_1)$. This is due to the convexity of this union. Also, the
linear interpolation map is here optimal. Moreover, elementary computations reveal that

$$C_p = \frac{1}{\min(p, q)} \quad\text{and}\quad W_2(\mu_0, \mu_1)^2 = 1.$$

Therefore, for every $0 < p < 1$ and any smooth $f : \mathbb{R}^2 \to \mathbb{R}$,

$$(\mathbf{E}_{\mu_0} f - \mathbf{E}_{\mu_1} f)^2 \le \frac{1}{\min(p, q)}\,\mathbf{E}_{\mu_p}(|\nabla f|^2).$$

Example 4.12 (When the linear interpolation map fails). In contrast, for the example
where $\mu_0 = \mathcal{U}([0, 2] \times [0, 2])$ and $\mu_1 = \mathcal{U}([1, 3] \times [1, 3])$ and if $\gamma$ is the natural linear
interpolation map given by $\gamma_x(s) = x + s(e_1 + e_2)$ then $\mu_{(s)}$ is not supported in $\mathrm{supp}(\mu_0) \cup
\mathrm{supp}(\mu_1)$ and this union is not convex. If $A = [0, 1] \times [2, 3]$ then $\mu_{(s)}(A) > 0$ for every
$0 < s < 1$ while $\mu_p(A) = 0$ for every $0 < p < 1$ and hence there is no finite constant
$C_p > 0$ such that $\bar{\mu} \le C_p \mu_p$. This shows that the linear interpolation map fails here. Let
us give an alternative interpolation map which leads to the desired result. We set for every
$x \in \mathrm{supp}(\mu_0)$ and every $0 \le s \le 1$, with $\mathbf{1} = e_1 + e_2$,

$$\gamma_x(s) = \begin{cases} (1 - s)x + 2s\mathbf{1} & \text{if } 0 \le s \le \frac{1}{2}, \\ sx + \mathbf{1} & \text{otherwise.} \end{cases}$$

This corresponds to a two-step linear interpolation between the squares $[0, 2]^2$ and $[1, 3]^2$
with intermediate square $[1, 2]^2$. For every $0 \le s \le 1$,

$$\mu_{(s)} = \begin{cases} \mathcal{U}([2s, 2]^2) & \text{if } 0 \le s \le \frac{1}{2}, \\ \mathcal{U}([1, 1 + 2s]^2) & \text{otherwise.} \end{cases}$$

Note that we constructed $\gamma$ in such a way that $\mu_{(s)}$ is always supported in $\mathrm{supp}(\mu_0) \cup
\mathrm{supp}(\mu_1)$. Elementary computations reveal that for every $0 < p < 1$,




Finally, putting all together, we obtain for every $0 < p < 1$ and smooth $f : \mathbb{R}^2 \to \mathbb{R}$,



As a conclusion, one can retain that the natural interpolation problem associated to the
control of the mean-difference involves a kind of support-constrained interpolation for mass
transportation.
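The support constraint discussed in the two examples can be illustrated with a few lines of code. The sketch below is an assumption-laden illustration for the squares $[0,2]^2$ and $[1,3]^2$: it samples starting points and times on a coarse grid (an arbitrary choice), checks that the two-step path stays inside the union of the two squares, and exhibits one point where the straight path leaves it.

```python
def gamma(x, s):
    """Two-step path: shrink toward [1,2]^2 for s <= 1/2, then expand to [1,3]^2."""
    x1, x2 = x
    if s <= 0.5:
        return ((1.0 - s) * x1 + 2.0 * s, (1.0 - s) * x2 + 2.0 * s)
    return (s * x1 + 1.0, s * x2 + 1.0)

def linear(x, s):
    """Straight path from x to T(x) = x + (1, 1)."""
    return (x[0] + s, x[1] + s)

def in_square(pt, lo, hi, eps=1e-12):
    """Membership in [lo, hi]^2, with a small tolerance for rounding."""
    return all(lo - eps <= c <= hi + eps for c in pt)

def in_union(pt):
    """Membership in the union of the supports [0,2]^2 and [1,3]^2."""
    return in_square(pt, 0.0, 2.0) or in_square(pt, 1.0, 3.0)

# Coarse illustrative grids of starting points and interpolation times.
xs = [k * 0.25 for k in range(9)]
ss = [k / 20.0 for k in range(21)]
two_step_ok = all(in_union(gamma((x1, x2), s)) for x1 in xs for x2 in xs for s in ss)
print("two-step path stays in the union:", two_step_ok)
print("straight path at x=(0,1.5), s=0.75:", linear((0.0, 1.5), 0.75),
      in_union(linear((0.0, 1.5), 0.75)))
```

The straight path passes through the corner region $[0,1] \times [2,3]$, which neither component charges, while the two-step path avoids it by routing every point through the overlap $[1,2]^2$.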